INTRODUCTION

This application claims the benefit of U.S. Provisional Application Serial No. 
60/101,939 filed September 25, 1998. 

Technical Field 

The present invention is directed to nucleic acid and amino acid sequences and 
constructs, and methods related thereto. 

Background

Through the development of plant genetic engineering techniques, it is now possible to
produce transgenic varieties of plant species to provide plants which have novel and desirable
characteristics. For example, it is now possible to genetically engineer plants for tolerance to
environmental stresses, such as resistance to pathogens and tolerance to herbicides, and to
improve the quality characteristics of the plant, for example improved fatty acid compositions.

However, the number of useful nucleotide sequences for the engineering of such
characteristics is thus far limited, and the speed with which new useful nucleotide sequences
for engineering new characteristics are identified is slow.

The characterization of various acyltransferase proteins is useful for the further study
of plant fatty acid synthesis systems and for the development of novel and/or alternative oil
sources. Studies of plant mechanisms may provide means to further enhance, control,
modify, or otherwise alter the total fatty acyl composition of triglycerides and oils.
Furthermore, the elucidation of the factor(s) critical to the natural production of fatty acids in
plants is desired, including the purification of such factors and the characterization of
element(s) and/or cofactors which enhance the efficiency of the system. Of particular interest
are the nucleic acid sequences of genes encoding proteins which may be useful for
applications in genetic engineering.

SUMMARY OF THE INVENTION

The present invention provides nucleic acid sequences encoding amino acid
sequences for a class of proteins which are related to acyltransferase proteins. Such proteins
are referred to herein as acyltransferase related or acyltransferase-like proteins.

By this invention, nucleic acid sequences encoding these acyltransferase related 
proteins may now be characterized with respect to enzyme activity. In particular, 
identification and isolation of nucleic acid sequences encoding for acyltransferase related 
proteins from Arabidopsis, yeast, corn, and soybean are provided. 

Thus, this invention encompasses acyltransferase related nucleic acid sequences and 
the corresponding amino acid sequences, and the use of these nucleic acid sequences in the 
preparation of oligonucleotides containing such acyltransferase related encoding sequences 
for analysis and recovery of plant acyltransferase related gene sequences. The acyltransferase 
related encoding sequence may encode a complete or partial sequence depending upon the 
intended use. All or a portion of the genomic sequence, or cDNA sequence, is intended. 

Of special interest are recombinant DNA constructs which provide for transcription or
transcription and translation (expression) of the acyltransferase related sequences in host
cells. In particular, constructs which are capable of transcription or transcription and
translation in plant host cells are preferred. For some applications a reduction in expression
of acyltransferase related sequences may be desired. Thus, recombinant constructs
may be designed having the acyltransferase related sequences in a reverse orientation for
expression of an anti-sense sequence, or co-suppression constructs, also known as
"transwitch" constructs, may be useful. Such constructs may contain a variety of regulatory regions
including transcriptional initiation regions obtained from genes preferentially expressed in 
plant seed tissue. For some uses, it may be desired to use the transcriptional and translational 
initiation regions of the acyltransferase related gene either with the acyltransferase related 
encoding sequence or to direct the transcription and translation of a heterologous sequence. 

Also considered in this invention are the plants and seeds containing the constructs 
and polynucleotides of this invention.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 provides the 204 amino acid conserved sequence profile identified from
comparisons of glycerol-3-phosphate acyltransferase and various lysophosphatidic acid
acyltransferases using PSI-BLAST.

Figure 2 provides an amino acid sequence alignment for the acyltransferase 
sequences. The alignment shown is of the regions of the protein extending from about 30 
amino acids prior to the conserved H in the conserved sequence HXXXXD to 100 amino 
acids after, or downstream, of the P in the conserved PEG sequence motif of the 
acyltransferase-like sequences. 

Figure 3 provides schematics showing the relationship of the identified 
acyltransferases. The relationships described are derived from an alignment of the regions of 
the protein extending from about 30 amino acids prior to the conserved H in the conserved 
sequence HXXXXD to 100 amino acids after, or downstream, of the P in the conserved PEG 
sequence motif of the acyltransferase-like sequences. Figure 3A provides a phylogenetic tree
showing the relationship of several acyltransferases. Figure 3B provides a table showing the 
percent similarities and percent divergence of the novel acyltransferases and known 
acyltransferases using the Clustal method with PAM250 residue weight table. 
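
The region used for the alignments of Figures 2 and 3 can be delimited programmatically. The following Python sketch is offered only as an illustration: the function name, the regular expression, and the choice of the first HXXXXD match and the first downstream PEG are assumptions made for this example and are not taken from the specification.

```python
import re
from typing import Optional

# HXXXXD: histidine, any four residues, then aspartate (X = any amino acid).
HXXXXD = re.compile(r"H.{4}D")


def conserved_region(protein: str, upstream: int = 30,
                     downstream: int = 100) -> Optional[str]:
    """Return the stretch from `upstream` residues before the conserved H
    to `downstream` residues after the P of the PEG motif, or None if
    either motif is absent. Coordinates are clipped to the sequence ends."""
    h_match = HXXXXD.search(protein)
    if h_match is None:
        return None
    # Assumption: the relevant PEG is the first one after the HXXXXD motif.
    p_index = protein.find("PEG", h_match.end())
    if p_index == -1:
        return None
    start = max(0, h_match.start() - upstream)
    end = min(len(protein), p_index + 1 + downstream)
    return protein[start:end]
```

A sequence lacking either motif returns None, so such sequences can be filtered out before alignment.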



DETAILED DESCRIPTION OF THE INVENTION 

In accordance with the subject invention, nucleotide sequences are provided which are
capable of coding for sequences of amino acids, such as a protein, polypeptide or peptide, and
which are related to nucleic acid sequences encoding acyltransferase proteins. Such sequences
are referred to herein as acyltransferase-like or acyltransferase related. The novel nucleic acid sequences find use in
the preparation of constructs to direct their expression in a host cell. Furthermore, the novel 
nucleic acid sequences may find use in the preparation of plant expression constructs to 
modify the fatty acid composition of a plant cell. 

In one embodiment of the present invention, nucleic acid sequences which are related to
acyltransferases, also referred to herein as polynucleotides, are identified from databases.

Isolated Proteins, Polypeptides and Polynucleotides

A first aspect of the present invention relates to isolated acyltransferase 
polynucleotides. The polynucleotide sequences of the present invention include isolated 
polynucleotides that encode the polypeptides of the invention having a deduced amino acid 
sequence selected from the group of sequences set forth in the Sequence Listing and to other 
polynucleotide sequences closely related to such sequences and variants thereof. 

The invention provides a polynucleotide sequence identical over its entire length to 
each coding sequence as set forth in the Sequence Listing. The invention also provides the 
coding sequence for the mature polypeptide or a fragment thereof, as well as the coding 
sequence for the mature polypeptide or a fragment thereof in a reading frame with other 
coding sequences, such as those encoding a leader or secretory sequence, a pre-, pro-, or 
prepro- protein sequence. The polynucleotide can also include non-coding sequences, 
including for example, but not limited to, non-coding 5' and 3' sequences, such as the 
transcribed, untranslated sequences, termination signals, ribosome binding sites, sequences 
that stabilize mRNA, introns, polyadenylation signals, and additional coding sequence that 
encodes additional amino acids. For example, a marker sequence can be included to facilitate 
the purification of the fused polypeptide. Polynucleotides of the present invention also 
include polynucleotides comprising a structural gene and the naturally associated sequences 
that control gene expression. 

The invention also includes polynucleotides of the formula:
X-(R1)n-(R2)-(R3)n-Y

wherein, at the 5' end, X is hydrogen, and at the 3' end, Y is hydrogen or a metal, R1 and R3
are any nucleic acid residue, n is an integer between 1 and 3000, preferably between 1 and
1000, and R2 is a nucleic acid sequence of the invention, particularly a nucleic acid sequence
selected from the group set forth in the Sequence Listing and preferably SEQ ID NOs: 1, 3, 5,
7, 9, 10, 12, 14, 16, 18, 20, 22, and 226-233. In the formula, R2 is oriented so that its 5' end
residue is at the left, bound to R1, and its 3' end residue is at the right, bound to R3. Any
stretch of nucleic acid residues denoted by either R group, where R is greater than 1, may be
either a heteropolymer or a homopolymer, preferably a heteropolymer.

The invention also relates to variants of the polynucleotides described herein that
encode variants of the polypeptides of the invention. Variants that are fragments of the
polynucleotides of the invention can be used to synthesize full-length polynucleotides of the
invention. Preferred embodiments are polynucleotides encoding polypeptide variants wherein
5 to 10, 1 to 5, 1 to 3, 2, 1 or no amino acid residues of a polypeptide sequence of the 
invention are substituted, added or deleted, in any combination. Particularly preferred are 
substitutions, additions, and deletions that are silent such that they do not alter the properties 
or activities of the polynucleotide or polypeptide. 

Nucleotide sequences encoding acyltransferases may be obtained from natural sources 
or be partially or wholly artificially synthesized. They may directly correspond to an 
acyltransferase endogenous to a natural source or contain modified amino acid sequences, 
such as sequences which have been mutated, truncated, increased or the like. Acyltransferases 
may be obtained by a variety of methods, including but not limited to, partial or homogenous 
purification of protein extracts, protein modeling, nucleic acid probes, antibody preparations 
and sequence comparisons. Typically an acyltransferase will be derived in whole or in part 
from a natural source. A natural source includes, but is not limited to, prokaryotic and 
eukaryotic sources, including, bacteria, yeasts, plants, including algae, and the like. 

Of special interest are acyltransferases which are obtainable from eukaryotic sources,
including those which are obtained from plants, and acyltransferases which are
obtainable through the use of these sequences. "Obtainable" refers to those acyltransferases
which have sequences sufficiently similar to the sequences provided herein to provide
a biologically active protein of the present invention.

Further preferred embodiments of the invention are polynucleotides that are at least 50%, 60%, or 70%
identical over their entire length to a polynucleotide encoding a polypeptide of the invention,
and polynucleotides that are complementary to such polynucleotides. More preferable are
polynucleotides that comprise a region that is at least 80% identical over its entire length to a
polynucleotide encoding a polypeptide of the invention and polynucleotides that are
complementary thereto. In this regard, polynucleotides at least 90% identical over their entire
length are particularly preferred, and those at least 95% identical are especially preferred. Further,
those with at least 97% identity are highly preferred and those with at least 98% and 99%
identity are particularly highly preferred, with those at least 99% being the most highly
preferred.
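
The identity thresholds above can be made concrete with a short calculation. The Python sketch below is illustrative only: it assumes the two sequences have already been aligned over their entire length (with "-" marking gaps), counts gap positions as non-identical, and uses hypothetical function names and tier labels chosen for this example.

```python
from typing import Optional

# Thresholds and labels follow the preference levels recited in the text,
# listed from most to least stringent.
TIERS = [
    (99.0, "most highly preferred"),
    (98.0, "particularly highly preferred"),
    (97.0, "highly preferred"),
    (95.0, "especially preferred"),
    (90.0, "particularly preferred"),
    (80.0, "more preferable"),
    (50.0, "preferred"),
]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both rows.
    Assumption: identity is measured over the full alignment length and a
    column containing a gap never counts as a match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b)
                  if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def preference_tier(identity: float) -> Optional[str]:
    """Return the label of the highest threshold met, or None below 50%."""
    for threshold, label in TIERS:
        if identity >= threshold:
            return label
    return None
```

For example, two aligned eight-residue sequences differing at one position are 87.5% identical and fall in the "more preferable" tier.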

Preferred embodiments are polynucleotides that encode polypeptides that retain 
substantially the same biological function or activity as the mature polypeptides encoded by 
the polynucleotides set forth in the Sequence Listing.

The invention further relates to polynucleotides that hybridize to the above-described
sequences. In particular, the invention relates to polynucleotides that hybridize under 
stringent conditions to the above-described polynucleotides. As used herein, the terms 
"stringent conditions" and "stringent hybridization conditions" mean that hybridization will 
generally occur if there is at least 95% and preferably at least 97% identity between the 
sequences. An example of stringent hybridization conditions is overnight incubation at 42°C
in a solution comprising 50% formamide, 5x SSC (150 mM NaCl, 15 mM trisodium citrate),
50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20
micrograms/milliliter denatured, sheared salmon sperm DNA, followed by washing the
hybridization support in 0.1x SSC at approximately 65°C. Other hybridization and wash
conditions are well known and are exemplified in Sambrook, et al., Molecular Cloning: A
Laboratory Manual, Second Edition, Cold Spring Harbor, NY (1989), particularly Chapter 11.

The invention also provides a polynucleotide consisting essentially of a 
polynucleotide sequence obtainable by screening an appropriate library containing the 
complete gene for a polynucleotide sequence set forth in the Sequence Listing under stringent
hybridization conditions with a probe having the sequence of said polynucleotide sequence or 
a fragment thereof; and isolating said polynucleotide sequence. Fragments useful for 
obtaining such a polynucleotide include, for example, probes and primers as described herein. 

As discussed herein regarding polynucleotide assays of the invention, for example, 
polynucleotides of the invention can be used as a hybridization probe for RNA, cDNA, or 
genomic DNA to isolate full length cDNAs or genomic clones encoding a polypeptide and to 
isolate cDNA or genomic clones of other genes that have a high sequence similarity to a 
polynucleotide set forth in the Sequence Listing. Such probes will generally comprise at least 
15 bases. Preferably such probes will have at least 30 bases and can have at least 50 bases. 
Particularly preferred probes will have between 30 bases and 50 bases, inclusive. 

The coding region of each gene that comprises or is comprised by a polynucleotide 
sequence set forth in the Sequence Listing may be isolated by screening using a DNA 
sequence provided in the Sequence Listing to synthesize an oligonucleotide probe. A labeled 
oligonucleotide having a sequence complementary to that of a gene of the invention is then 
used to screen a library of cDNA, genomic DNA or mRNA to identify members of the library 
which hybridize to the probe. For example, synthetic oligonucleotides are prepared which 
correspond to the N-terminal sequence of the polypeptide. The partial sequences so prepared 
can then be used as probes to obtain acyltransferase clones from a gene library prepared from
a cell source of interest. Alternatively, where oligonucleotides of low degeneracy can be
prepared from particular peptides, such probes may be used directly to screen gene libraries
for gene sequences. In particular, screening of cDNA libraries in phage vectors is useful in
such methods due to lower levels of background hybridization.
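
The notion of a probe of "low degeneracy" can be quantified. The Python sketch below is a minimal illustration that assumes the standard genetic code: it multiplies the number of codons available for each residue to count how many distinct oligonucleotides could encode a peptide, and scans a peptide for the window with the fewest possibilities. The function names are hypothetical and chosen for this example.

```python
from typing import Tuple

# Number of codons per amino acid in the standard genetic code (61 in total).
CODON_COUNTS = {
    "M": 1, "W": 1,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2, "K": 2, "D": 2, "E": 2, "C": 2,
    "I": 3,
    "V": 4, "P": 4, "T": 4, "A": 4, "G": 4,
    "L": 6, "S": 6, "R": 6,
}


def probe_degeneracy(peptide: str) -> int:
    """Number of distinct coding sequences for the peptide."""
    total = 1
    for residue in peptide.upper():
        if residue not in CODON_COUNTS:
            raise ValueError(f"unknown amino acid: {residue!r}")
        total *= CODON_COUNTS[residue]
    return total


def least_degenerate_window(peptide: str, length: int) -> Tuple[int, str, int]:
    """Return (0-based start, window, degeneracy) for the window of the
    given length with the lowest degeneracy; the first such window wins ties."""
    if length < 1 or length > len(peptide):
        raise ValueError("window length must be between 1 and len(peptide)")
    candidates = [
        (start, peptide[start:start + length])
        for start in range(len(peptide) - length + 1)
    ]
    start, window = min(candidates, key=lambda c: probe_degeneracy(c[1]))
    return start, window, probe_degeneracy(window)
```

Windows rich in Met and Trp, which each have a single codon, give the least degenerate probes, whereas Leu, Ser and Arg each contribute a factor of six.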

Typically, a sequence obtainable from the use of nucleic acid probes will show 60- 
70% sequence identity between the target acyltransferase sequence and the encoding sequence 
used as a probe. However, lengthy sequences with as little as 50-60% sequence identity may 
also be obtained. The nucleic acid probes may be a lengthy fragment of the nucleic acid 
sequence, or may also be a shorter, oligonucleotide probe. When longer nucleic acid 
fragments are employed as probes (greater than about 100 bp), one may screen at lower 
stringencies in order to obtain sequences from the target sample which have 20-50% 
deviation (i.e., 50-80% sequence homology) from the sequences used as probe. 
Oligonucleotide probes can be considerably shorter than the entire nucleic acid sequence 
encoding an acyltransferase enzyme, but should be at least about 10, preferably at least about 
15, and more preferably at least about 20 nucleotides. A higher degree of sequence identity is 
desired when shorter regions are used as opposed to longer regions. It may thus be desirable 
to identify regions of highly conserved amino acid sequence to design oligonucleotide probes 
for detecting and recovering other related genes. Shorter probes are often particularly useful 
for polymerase chain reactions (PCR), especially when highly conserved sequences can be 
identified (see Gould, et al., PNAS USA (1989) 86:1934-1938).

The skilled artisan will appreciate that, in many cases, an isolated cDNA sequence 
will be incomplete, in that the region coding for the polypeptide is truncated with respect to 
the 5' terminus of the cDNA. This is a consequence of the reverse transcriptase, an enzyme 
with low 'processivity' (a measure of the ability of the enzyme to remain attached to the 
template during the polymerization reaction) employed during the first strand cDNA 
synthesis. 

Several methods are available, and are well known to the skilled artisan, to obtain
full-length cDNAs, or extend short cDNAs, for example those based on the method of Rapid
Amplification of cDNA Ends (RACE) (see, for example, Frohman et al. (1988) Proc. Natl.
Acad. Sci. USA 85:8998-9002). Recent modifications of the technique, exemplified by the
Marathon™ technology (Clontech Laboratories, Inc.) for example, have significantly
simplified obtaining full-length cDNA sequences.

Another aspect of the present invention relates to isolated acyltransferase
polypeptides. Such polypeptides include isolated polypeptides set forth in the Sequence
Listing, as well as polypeptides and fragments thereof, particularly those polypeptides which
exhibit acyltransferase activity and also those polypeptides which have at least 50%, 60% or
70% identity, preferably at least 80% identity, more preferably at least 90% identity, and most
preferably at least 95% identity to a polypeptide sequence selected from the group of
sequences set forth in the Sequence Listing, and also include portions of such polypeptides,
wherein such portion of the polypeptide preferably includes at least 30 amino acids and more
preferably includes at least 50 amino acids.

"Identity", as is well understood in the art, is a relationship between two or more
polypeptide sequences or two or more polynucleotide sequences, as determined by comparing
the sequences. In the art, "identity" also means the degree of sequence relatedness between
polypeptide or polynucleotide sequences, as determined by the match between strings of such
sequences. "Identity" can be readily calculated by known methods including, but not limited
to, those described in Computational Molecular Biology, Lesk, A.M., ed., Oxford University
Press, New York (1988); Biocomputing: Informatics and Genome Projects, Smith, D.W., ed.,
Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part 1, Griffin,
A.M. and Griffin, H.G., eds., Humana Press, New Jersey (1994); Sequence Analysis in
Molecular Biology, von Heijne, G., Academic Press (1987); Sequence Analysis Primer,
Gribskov, M. and Devereux, J., eds., Stockton Press, New York (1991); and Carillo, H., and
Lipman, D., SIAM J Applied Math, 48:1073 (1988). Methods to determine identity are
designed to give the largest match between the sequences tested. Moreover, methods to
determine identity are codified in publicly available programs. Computer programs which
can be used to determine identity between two sequences include, but are not limited to, GCG
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984)); and the suite of five BLAST
programs, three designed for nucleotide sequence queries (BLASTN, BLASTX, and
TBLASTX) and two designed for protein sequence queries (BLASTP and TBLASTN)
(Coulson, Trends in Biotechnology, 12: 76-80 (1994); Birren, et al., Genome Analysis, 1:
543-559 (1997)). The BLASTX program is publicly available from NCBI and other sources
(BLAST Manual, Altschul, S., et al., NCBI NLM NIH, Bethesda, MD 20894; Altschul, S., et
al., J. Mol. Biol., 215:403-410 (1990)). The well known Smith Waterman algorithm can also
be used to determine identity.

Parameters for polypeptide sequence comparison typically include the following:
Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)
Comparison matrix: BLOSUM62 from Henikoff and Henikoff, Proc. Natl. Acad.
Sci. USA 89:10915-10919 (1992)
Gap Penalty: 12
Gap Length Penalty: 4

A program which can be used with these parameters is publicly available as the "gap"
program from Genetics Computer Group, Madison, Wisconsin. The above parameters along
with no penalty for end gap are the default parameters for peptide comparisons.

Parameters for polynucleotide sequence comparison include the following:
Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)
Comparison matrix: matches = +10; mismatches = 0
Gap Penalty: 50
Gap Length Penalty: 3

A program which can be used with these parameters is publicly available as the "gap"
program from Genetics Computer Group, Madison, Wisconsin. The above parameters are the
default parameters for nucleic acid comparisons.
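
To show how the polynucleotide parameters enter a global alignment score, the following Python sketch implements the affine-gap form of Needleman-Wunsch dynamic programming (often attributed to Gotoh). It is not the GCG "gap" program and is not guaranteed to reproduce its output. Two assumptions are made for this example: a gap of k positions costs the Gap Penalty plus k times the Gap Length Penalty, and gaps at the sequence ends are penalized like internal gaps. Only the optimal score is returned, not the alignment itself.

```python
NEG_INF = float("-inf")


def global_alignment_score(a: str, b: str, match: int = 10, mismatch: int = 0,
                           gap_penalty: int = 50,
                           gap_length_penalty: int = 3) -> float:
    """Best global alignment score of a and b with affine gap costs."""
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]; X: a[i-1] aligned to a gap;
    # Y: b[j-1] aligned to a gap. Each holds the best score ending that way.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_penalty + gap_length_penalty * i)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_penalty + gap_length_penalty * j)

    open_cost = gap_penalty + gap_length_penalty  # first position of a gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1],
                          Y[i - 1][j - 1]) + s
            X[i][j] = max(M[i - 1][j] - open_cost,
                          X[i - 1][j] - gap_length_penalty,
                          Y[i - 1][j] - open_cost)
            Y[i][j] = max(M[i][j - 1] - open_cost,
                          Y[i][j - 1] - gap_length_penalty,
                          X[i][j - 1] - open_cost)
    return max(M[n][m], X[n][m], Y[n][m])
```

With these values a single gap position costs as much as 5.3 matches earn, so the optimal alignment opens a gap only when the sequences differ in length or when a gap recovers a long run of matches.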

The invention also includes polypeptides of the formula:
X-(R1)n-(R2)-(R3)n-Y

wherein, at the amino terminus, X is hydrogen, and at the carboxyl terminus, Y is hydrogen or
a metal, R1 and R3 are any amino acid residue, n is an integer between 1 and 1000, and R2 is
an amino acid sequence of the invention, particularly an amino acid sequence selected from
the group set forth in the Sequence Listing and preferably SEQ ID NOs: 2, 4, 6, 8, 11, 13, 15,
17, 19, 21, 23, and 218-225. In the formula, R2 is oriented so that its amino terminal residue
is at the left, bound to R1, and its carboxy terminal residue is at the right, bound to R3. Any
stretch of amino acid residues denoted by either R group, where R is greater than 1, may be
either a heteropolymer or a homopolymer, preferably a heteropolymer.

Polypeptides of the present invention include isolated polypeptides encoded by a
polynucleotide comprising a sequence selected from the group of a sequence contained in
SEQ ID NOs: 1, 3, 5, 7, 9, 10, 12, 14, 16, 18, 20, 22, and 226-233.

The polypeptides of the present invention can be mature protein or can be part of a
fusion protein.

Fragments and variants of the polypeptides are also considered to be a part of the
invention. A fragment is a variant polypeptide which has an amino acid sequence that is 
entirely the same as part but not all of the amino acid sequence of the previously described 
polypeptides. The fragments can be "free-standing" or comprised within a larger polypeptide 
of which the fragment forms a part or a region, most preferably as a single continuous region. 
Preferred fragments are biologically active fragments which are those fragments that mediate 
activities of the polypeptides of the invention, including those with similar activity or 
improved activity or with a decreased activity. Also included are those fragments that
are antigenic or immunogenic in an animal, particularly a human.

Variants of the polypeptide also include polypeptides that vary from the sequences set 
forth in the Sequence Listing by conservative amino acid substitutions, substitution of a 
residue by another with like characteristics. In general, such substitutions are among Ala, 
Val, Leu and Ile; between Ser and Thr; between Asp and Glu; between Asn and Gln; between
Lys and Arg; or between Phe and Tyr. Particularly preferred are variants in which 5 to 10; 1 
to 5; 1 to 3 or one amino acid(s) are substituted, deleted, or added, in any combination. 
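
The conservative substitution groups listed above (Ala/Val/Leu/Ile, Ser/Thr, Asp/Glu, Asn/Gln, Lys/Arg, Phe/Tyr) lend themselves to a simple lookup. The Python sketch below is an illustration under stated assumptions: residues are given in one-letter code, the two sequences are of equal length with no insertions or deletions, and the function names are hypothetical.

```python
from typing import List, Tuple

# One-letter codes for the groups recited in the text.
CONSERVATIVE_GROUPS = [
    frozenset("AVLI"),  # Ala, Val, Leu, Ile
    frozenset("ST"),    # Ser, Thr
    frozenset("DE"),    # Asp, Glu
    frozenset("NQ"),    # Asn, Gln
    frozenset("KR"),    # Lys, Arg
    frozenset("FY"),    # Phe, Tyr
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two different residues belong to the same group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def classify_substitutions(reference: str,
                           variant: str) -> List[Tuple[int, str, str, bool]]:
    """List (1-based position, reference residue, variant residue,
    conservative?) for every position where the sequences differ."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (pos, r, v, is_conservative(r, v))
        for pos, (r, v) in enumerate(zip(reference, variant), start=1)
        if r != v
    ]
```

Counting the entries returned by classify_substitutions gives the number of substituted residues, which can be compared with the "5 to 10; 1 to 5; 1 to 3 or one" ranges recited above.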

Variants that are fragments of the polypeptides of the invention can be used to 
produce the corresponding full length polypeptide by peptide synthesis. Therefore, these 
variants can be used as intermediates for producing the full-length polypeptides of the 
invention. 

The polynucleotides and polypeptides of the invention can be used, for example, in 
the transformation of various host cells, as further discussed herein. 

The invention also provides polynucleotides that encode a polypeptide that is a mature 
protein plus additional amino or carboxyl-terminal amino acids, or amino acids within the 
mature polypeptide (for example, when the mature form of the protein has more than one 
polypeptide chain). Such sequences can, for example, play a role in the processing of a 
protein from a precursor to a mature form, allow protein transport, shorten or lengthen protein 
half-life, or facilitate manipulation of the protein in assays or production. It is contemplated 
that cellular enzymes can be used to remove any additional amino acids from the mature 
protein. 

A precursor protein, having the mature form of the polypeptide fused to one or more 
prosequences may be an inactive form of the polypeptide. The inactive precursors generally 
are activated when the prosequences are removed. Some or all of the prosequences may be 
removed prior to activation. Such precursor proteins are generally called proproteins.

The polynucleotide and polypeptide sequences can also be used to identify additional
sequences which are homologous to the sequences of the present invention. The most
preferable and convenient method is to store the sequence in a computer readable medium,
for example, floppy disk, CD ROM, hard disk drives, external disk drives and DVD, and then
to use the stored sequence to search a sequence database with well known searching tools.
Examples of public databases include the DNA Database of Japan
(DDBJ) (http://www.ddbj.nig.ac.jp/); GenBank
(http://www.ncbi.nlm.nih.gov/web/Genbank/Index.html); and the European Molecular
Biology Laboratory Nucleic Acid Sequence Database (EMBL)
(http://www.ebi.ac.uk/ebi_docs/embl_db.html). A number of different search algorithms are
available to the skilled artisan, one example of which is the suite of programs referred to as
BLAST programs. There are five implementations of BLAST, three designed for nucleotide
sequence queries (BLASTN, BLASTX, and TBLASTX) and two designed for protein
sequence queries (BLASTP and TBLASTN) (Coulson, Trends in Biotechnology, 12: 76-80
(1994); Birren, et al., Genome Analysis, 1: 543-559 (1997)). Additional programs are
available in the art for the analysis of identified sequences, such as sequence alignment 
programs, programs for the identification of more distantly related sequences, and the like, 
and are well known to the skilled artisan. 
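
The choice among the five BLAST programs depends on the type of query and the type of database. The Python sketch below tabulates this as a lookup. The description of what each program compares reflects general BLAST usage rather than text in this specification, and the table layout and function name are assumptions made for this example.

```python
from typing import List

# program -> (query type, database type, level at which the comparison is made)
BLAST_PROGRAMS = {
    "BLASTN":  ("nucleotide", "nucleotide", "nucleotide"),
    "BLASTX":  ("nucleotide", "protein",    "protein"),  # query translated
    "TBLASTX": ("nucleotide", "nucleotide", "protein"),  # both translated
    "BLASTP":  ("protein",    "protein",    "protein"),
    "TBLASTN": ("protein",    "nucleotide", "protein"),  # database translated
}


def programs_for_query(query_type: str) -> List[str]:
    """Names of the BLAST programs that accept the given query type."""
    if query_type not in ("nucleotide", "protein"):
        raise ValueError("query_type must be 'nucleotide' or 'protein'")
    return [name for name, (query, _db, _level) in BLAST_PROGRAMS.items()
            if query == query_type]
```

A nucleotide query can thus be run with BLASTN, BLASTX or TBLASTX, and a protein query with BLASTP or TBLASTN, matching the three-and-two split described in the text.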

Plant Constructs and Methods of Use 

Of particular interest in the present invention is the use of the nucleotide sequences, or
polynucleotides, in recombinant DNA constructs to direct the transcription or transcription
and translation (expression) of the acyltransferase sequences of the present invention in a host
cell. The expression constructs generally comprise a promoter functional in a host cell operably linked
to a nucleic acid sequence encoding an acyltransferase of the present invention and a
transcriptional termination region functional in a host cell.

By "host cell" is meant a cell which contains a vector and supports the replication, 
and/or transcription or transcription and translation (expression) of the expression construct.
Host cells for use in the present invention can be prokaryotic cells, such as E. coli, or
eukaryotic cells such as yeast, plant, insect, amphibian, or mammalian cells. Preferably, host
cells are monocotyledonous or dicotyledonous plant cells.

Of particular interest in the present invention is the use of the polynucleotides of the 
present invention for the preparation of constructs to direct the transcription or transcription 
and translation of the nucleotide sequences encoding an acyltransferase in a host plant cell. 
Plant expression constructs generally comprise a promoter functional in a plant host cell 
operably linked to a nucleic acid sequence of the present invention and a transcriptional termination
region functional in a host plant cell. 

Those skilled in the art will recognize that there are a number of promoters which are 
functional in plant cells, and have been described in the literature. Chloroplast and plastid 
specific promoters, chloroplast or plastid functional promoters, and chloroplast or plastid 
operable promoters are also envisioned. 

One set of promoters are constitutive promoters such as the CaMV35S or FMV35S 
promoters that yield high levels of expression in most plant organs. Enhanced or duplicated 
versions of the CaMV35S and FMV35S promoters are useful in the practice of this invention 
(Odell, et al. (1985) Nature 313:810-812; Rogers, U.S. Patent Number 5,378,619). In
addition, it may also be preferred to bring about expression of the protein of interest in 
specific tissues of the plant, such as leaf, stem, root, tuber, seed, fruit, etc., and the promoter 
chosen should have the desired tissue and developmental specificity. 

Of particular interest is the expression of the nucleic acid sequences of the present 
invention from transcription initiation regions which are preferentially expressed in a plant 
seed tissue. Examples of such seed preferential transcription initiation sequences include 
those sequences derived from sequences encoding plant storage protein genes or from genes 
involved in fatty acid biosynthesis in oilseeds. Examples of such promoters include the 5' 
regulatory regions from such genes as napin (Kridl et al., Seed Sci. Res. 1:209-219 (1991)),
phaseolin, zein, soybean trypsin inhibitor, ACP, stearoyl-ACP desaturase, soybean α' subunit
of β-conglycinin (soy 7s, (Chen et al., Proc. Natl. Acad. Sci., 83:8560-8564 (1986))) and
oleosin.

It may be advantageous to direct the localization of proteins conferring acyltransferase activity
to a particular subcellular compartment, for example, to the mitochondrion, endoplasmic 
reticulum, vacuoles, chloroplast or other plastidic compartment. For example, where the 
genes of interest of the present invention will be targeted to plastids, such as chloroplasts, for
expression, the constructs will also employ the use of sequences to direct the gene to the
plastid. Such sequences are referred to herein as chloroplast transit peptides (CTP) or plastid 
transit peptides (PTP). In this manner, where the gene of interest is not directly inserted into 
the plastid, the expression construct will additionally contain a gene encoding a transit 
peptide to direct the gene of interest to the plastid. The chloroplast transit peptides may be 
derived from the gene of interest, or may be derived from a heterologous sequence having a 
CTP. Such transit peptides are known in the art. See, for example, Von Heijne et al. (1991) 
Plant Mol. Biol. Rep. 9:104-126; Clark et al. (1989) J. Biol. Chem. 264:17544-17550;
della-Cioppa et al. (1987) Plant Physiol. 84:965-968; Romer et al. (1993) Biochem. Biophys. Res.
Commun. 196:1414-1421; and Shah et al. (1986) Science 233:478-481. Additional transit
peptides for the translocation of the protein to the endoplasmic reticulum (ER), or vacuole 
may also find use in the constructs of the present invention. 

Depending upon the intended use, the constructs may contain the nucleic acid 
sequence which encodes the entire acyltransferase protein, or a portion thereof. For example, 
where antisense inhibition of a given acyltransferase protein is desired, the entire sequence is 
not required. Furthermore, where acyltransferase sequences used in constructs are intended 
for use as probes, it may be advantageous to prepare constructs containing only a particular 
portion of an acyltransferase encoding sequence, for example a sequence which is discovered
to encode a highly conserved acyltransferase region. 

The skilled artisan will recognize that there are various methods for the inhibition of 
expression of endogenous sequences in a host cell. Such methods include, but are not limited
to, antisense suppression (Smith, et al. (1988) Nature 334:724-726), co-suppression (Napoli,
et al. (1989) Plant Cell 2:279-289), ribozymes (PCT Publication WO 97/10328), and
combinations of sense and antisense, such as those described by Waterhouse, et al. (1998)
Proc. Natl. Acad. Sci. USA 95:13959-13964. Methods for the suppression of endogenous
sequences in a host cell typically employ the transcription or transcription and translation of 
at least a portion of the sequence to be suppressed. Such sequences may be homologous to 
coding as well as non-coding regions of the endogenous sequence. 

Regulatory transcript termination regions may be provided in plant expression 
constructs of this invention as well. Transcript termination regions may be provided by the 
DNA sequence encoding the acyltransferase or a convenient transcription termination region 
derived from a different gene source, for example, the transcript termination region which is 
naturally associated with the transcript initiation region. The skilled artisan will recognize
that any convenient transcript termination region which is capable of terminating transcription
in a plant cell may be employed in the constructs of the present invention. 

Alternatively, constructs may be prepared to direct the expression of the 
acyltransferase sequences directly from the host plant cell plastid. Such constructs and 
methods are known in the art and are generally described, for example, in Svab, et al. (1990) 
Proc. Natl. Acad. Sci. USA 87:8526-8530 and Svab and Maliga (1993) Proc. Natl. Acad. Sci. 
USA 90:913-917 and in U.S. Patent Number 5,693,507. 

A plant cell, tissue, organ, or plant into which the recombinant DNA constructs 
containing the expression constructs have been introduced is considered transformed, 
transfected, or transgenic. A transgenic or transformed cell or plant also includes progeny of 
the cell or plant and progeny produced from a breeding program employing such a transgenic 
plant as a parent in a cross and exhibiting an altered genotype resulting from the presence of 
an introduced acyltransferase nucleic acid sequence. 

The term "introduced" in the context of inserting a nucleic acid sequence into a cell, 
means "transfection", or "transformation" or "transduction" and includes reference to the 
incorporation of a nucleic acid sequence into a eukaryotic or prokaryotic cell where the 
nucleic acid sequence may be incorporated into the genome of the cell (for example, 
chromosome, plasmid, plastid, or mitochondrial DNA), converted into an autonomous 
replicon, or transiently expressed (for example, transfected mRNA). 

Plant expression or transcription constructs having an acyltransferase as the DNA 
sequence of interest for increased or decreased expression thereof may be employed with a 
wide variety of plant life, particularly, plant life involved in the production of vegetable oils 
for edible and industrial uses. Plants of interest in the present invention include 
monocotyledonous and dicotyledonous plants. Most especially preferred are temperate
oilseed crops. Plants of interest include, but are not limited to, rapeseed (Canola and High
Erucic Acid varieties), sunflower, safflower, cotton, soybean, peanut, coconut and oil palms,
and corn. Depending on the method for introducing the recombinant constructs into the host
cell, other DNA sequences may be required. Importantly, this invention is applicable to
dicotyledonous and monocotyledonous species alike and will be readily applicable to new and/or
improved transformation and regulation techniques. 

As used herein, the term "plant" includes reference to whole plants, plant organs (for 
example, leaves, stems, roots, etc.), seeds, and plant cells and progeny of same. Plant cell, as 
used herein includes, without limitation, seeds, suspension cultures, embryos, meristematic
regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen, and
microspores. The class of plants which can be used in the methods of the present invention is
generally as broad as the class of higher plants amenable to transformation techniques,
including both monocotyledonous and dicotyledonous plants. Particularly preferred plants of
interest include, but are not limited to, rapeseed (Canola and High Erucic Acid varieties),
sunflower, safflower, cotton, soybean, peanut, coconut and oil palms, and corn. Most 
especially preferred plants include Brassica, soybean, and corn. 

As used herein, "transgenic plant" includes reference to a plant which comprises
within its genome a heterologous polynucleotide. Generally, the heterologous polynucleotide
is stably integrated within the genome such that the polynucleotide is passed on to successive
generations. The heterologous polynucleotide may be integrated into the genome alone or as
part of a recombinant expression cassette. "Transgenic" is used herein to include any cell,
cell line, callus, tissue, plant part or plant, the genotype of which has been altered by the
presence of heterologous nucleic acid including those transgenics initially so altered as well
as those created by sexual crosses or asexual propagation from the initial transgenic.

Thus a plant having within its cells a heterologous polynucleotide is referred to herein
as a transgenic plant. The heterologous polynucleotide can be either stably integrated into the
genome, or can be extra-chromosomal. Preferably, the polynucleotide of the present
invention is stably integrated into the genome such that the polynucleotide is passed on to
successive generations. The polynucleotide is integrated into the genome alone or as part of a
recombinant expression cassette. "Transgenic" is used herein to include any cell, cell line,
callus, tissue, plant part or plant, the genotype of which has been altered by the presence of
heterologous nucleic acids including those transgenics initially so altered as well as those
created by sexual crosses or asexual reproduction of the initial transgenics.

As used herein, "heterologous" in reference to a nucleic acid is a nucleic acid that
originates from a foreign species, or, if from the same species, is substantially modified from
its native form in composition and/or genomic locus by deliberate human intervention. For
example, a promoter operably linked to a heterologous structural gene is from a species
different from that from which the structural gene was derived, or, if from the same species,
one or both are substantially modified from their original form. A heterologous protein may
originate from a foreign species, or, if from the same species, is substantially modified from
its original form by deliberate human intervention.

As used herein, a "recombinant expression cassette" is a nucleic acid construct,
generated recombinantly or synthetically, with a series of specified nucleic acid elements
which permit transcription of a particular nucleic acid in a target cell. The recombinant
expression cassette can be incorporated into a plasmid, chromosome, mitochondrial DNA,
plastid DNA, virus, or nucleic acid fragment. Typically, the recombinant expression cassette
portion of an expression vector includes, among other sequences, a nucleic acid sequence to 
be transcribed and a promoter. 

It is contemplated that the gene sequences may be synthesized, either completely or in
part, especially where it is desirable to provide plant-preferred sequences. Thus, all or a
portion of the desired structural gene (that portion of the gene which encodes the
acyltransferase protein) may be synthesized using codons preferred by a selected host. Host-
preferred codons may be determined, for example, from the codons used most frequently in
the proteins expressed in a desired host species.
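
By way of illustration only, the determination of host-preferred codons described above can be sketched as follows. This is a minimal sketch, assuming the input is a set of in-frame coding sequences from the desired host; the function names `preferred_codons` and `back_translate` are hypothetical and are not part of the disclosure, and picking the single most frequent codon per amino acid is only one of several possible ways to apply a codon preference.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def preferred_codons(coding_sequences):
    """Return {amino acid: most frequently used codon} for in-frame sequences."""
    counts = defaultdict(Counter)
    for cds in coding_sequences:
        cds = cds.upper()
        # Walk the sequence codon by codon, ignoring any trailing partial codon.
        for pos in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[pos:pos + 3]
            amino_acid = CODON_TABLE.get(codon)
            if amino_acid is not None and amino_acid != "*":
                counts[amino_acid][codon] += 1
    return {aa: c.most_common(1)[0][0] for aa, c in counts.items()}


def back_translate(protein, preference):
    """Assumed helper: spell a protein using one preferred codon per residue."""
    return "".join(preference[aa] for aa in protein.upper())
```

For example, given a hypothetical host sequence `ATGGCTGCTGCC`, the sketch selects `GCT` over `GCC` for alanine because it occurs more often.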

One skilled in the art will readily recognize that antibody preparations, nucleic acid
probes (DNA and RNA) and the like may be prepared and used to screen and recover
"homologous" or "related" acyltransferases from a variety of plant sources. Homologous
sequences are found when there is an identity of sequence, which may be determined upon
comparison of sequence information, nucleic acid or amino acid, or through hybridization
reactions between a known acyltransferase and a candidate source. Conservative changes,
such as Glu/Asp, Val/Ile, Ser/Thr, Arg/Lys and Gln/Asn may also be considered in
determining sequence homology. Amino acid sequences are considered homologous by as
little as 25% sequence identity between the two complete mature proteins. (See generally,
Doolittle, R.F., Of URFs and ORFs (University Science Books, CA, 1986).)
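
The comparison described above can be sketched, for illustration only, as a calculation over two sequences that have already been aligned. The sketch assumes that gaps are written as "-", that identity is computed over ungapped alignment columns (the text does not specify the denominator), and that only the five conservative pairs listed above are counted as conservative; the function name is hypothetical.

```python
# Conservative changes named in the text: Glu/Asp, Val/Ile, Ser/Thr, Arg/Lys, Gln/Asn.
CONSERVATIVE = {
    frozenset(pair)
    for pair in (("E", "D"), ("V", "I"), ("S", "T"), ("R", "K"), ("Q", "N"))
}


def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, percent identity plus conservative changes)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = conservative = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            identical += 1
        elif frozenset((a, b)) in CONSERVATIVE:
            conservative += 1
    if compared == 0:
        return 0.0, 0.0
    return (100.0 * identical / compared,
            100.0 * (identical + conservative) / compared)
```

Under these assumptions, a candidate whose first returned value is 25 or more would meet the identity threshold stated above.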

Thus, other acyltransferase sequences can be obtained from the specific exemplified
sequences provided herein. Furthermore, it will be apparent that one can obtain natural and
synthetic sequences, including modified amino acid sequences and starting materials for
synthetic-protein modeling from the exemplified sequences and from acyltransferases which
are obtained through the use of such exemplified sequences. Modified amino acid sequences
include sequences which have been mutated, truncated, increased and the like, whether such
sequences were partially or wholly synthesized. Sequences which are actually purified from
plant preparations or are identical or encode identical proteins thereto, regardless of the
method used to obtain the protein or sequence, are equally considered naturally derived.

For immunological screening, antibodies to the acyltransferase protein can be
prepared by injecting rabbits or mice with the purified protein or portion thereof, such 
methods of preparing antibodies being well known to those in the art. Either monoclonal or 
polyclonal antibodies can be produced, although typically polyclonal antibodies are more 
useful for gene isolation. Western analysis may be conducted to determine that a related 
protein is present in a crude extract of the desired plant species, as determined by cross- 
reaction with the antibodies to the acyltransferase protein. When cross-reactivity is observed, 
genes encoding the related proteins are isolated by screening expression libraries representing 
the desired plant species. Expression libraries can be constructed in a variety of commercially 
available vectors, including lambda gt11, as described in Sambrook, et al. (Molecular
Cloning: A Laboratory Manual, Second Edition (1989) Cold Spring Harbor Laboratory, Cold
Spring Harbor, New York). 

The nucleic acid sequences associated with acyltransferase proteins will find many 
uses. For example, recombinant constructs can be prepared which can be used as probes, or 
which will provide for expression of the acyltransferase protein in host cells to produce a 
ready source of the enzyme and/or to modify the composition of triglycerides found therein. 
Other useful applications may be found when the host cell is a plant host cell, either in vitro 
or in vivo. 

The modification of fatty acid compositions may also affect the fluidity of plant 
membranes. Different lipid concentrations have been observed in cold-hardened plants, for 
example. By this invention, one may be capable of introducing traits which will lend to chill 
tolerance. Constitutive or temperature inducible transcription initiation regulatory control 
regions may have special applications for such uses. 

As discussed above, nucleic acid sequence encoding an acyltransferase of this 
invention may include genomic, cDNA or mRNA sequence. By "encoding" is meant that the 
sequence corresponds to a particular amino acid sequence either in a sense or anti-sense 
orientation. By "extrachromosomal" is meant that the sequence is outside of the plant
genome with which it is naturally associated. By "recombinant" is meant that the sequence
contains a genetically engineered modification through manipulation via mutagenesis, 
restriction enzymes, and the like. 

Once the desired acyltransferase nucleic acid sequence is obtained, it may be 
manipulated in a variety of ways. Where the sequence involves non-coding flanking regions, 
the flanking regions may be subjected to resection, mutagenesis, etc. Thus, transitions,
transversions, deletions, and insertions may be performed on the naturally occurring
sequence. In addition, all or part of the sequence may be synthesized. In the structural gene,
one or more codons may be modified to provide for a modified amino acid sequence, or one
or more codon mutations may be introduced to provide for a convenient restriction site or
other purpose involved with construction or expression. The structural gene may be further
modified by employing synthetic adapters, linkers to introduce one or more convenient
restriction sites, or the like.

The nucleic acid or amino acid sequences encoding an acyltransferase of this
invention may be combined with other non-native, or "heterologous", sequences in a variety
of ways. By "heterologous" sequences is meant any sequence which is not naturally found
joined to the acyltransferase, including, for example, combinations of nucleic acid sequences
from the same plant which are not naturally found joined together.

The DNA sequence encoding an acyltransferase of this invention may be employed in
conjunction with all or part of the gene sequences normally associated with the
acyltransferase. In its component parts, a DNA sequence encoding acyltransferase is
combined in a DNA construct having, in the 5' to 3' direction of transcription, a transcription
initiation control region capable of promoting transcription and translation in a host cell, the
DNA sequence encoding plant acyltransferase and a transcription and translation termination
region.

Potential host cells include both prokaryotic cells, such as E. coli, and eukaryotic cells
such as yeast, insect, amphibian, or mammalian cells. A host cell may be unicellular or found
in a multicellular differentiated or undifferentiated organism depending upon the intended
use. Preferably, host cells of the present invention include plant cells, both
monocotyledonous and dicotyledonous. Cells of this invention may be distinguished by
having a sequence foreign to the wild-type cell present therein, for example, by having a
recombinant nucleic acid construct encoding an acyltransferase therein.

The methods used for the transformation of the host plant cell are not critical to the
present invention. The transformation of the plant is preferably permanent, i.e. by integration
of the introduced expression constructs into the host plant genome, so that the introduced
constructs are passed onto successive plant generations. The skilled artisan will recognize
that a wide variety of transformation techniques exist in the art, and new techniques are
continually becoming available. Any technique that is suitable for the target host plant can be
employed within the scope of the present invention. For example, the constructs can be
introduced in a variety of forms including, but not limited to, as a strand of DNA, in a
plasmid, or in an artificial chromosome. The introduction of the constructs into the target
plant cells can be accomplished by a variety of techniques, including, but not limited to,
calcium-phosphate-DNA co-precipitation, electroporation, microinjection, Agrobacterium
infection, liposomes or microprojectile transformation. The skilled artisan can refer to the
literature for details and select suitable techniques for use in the methods of the present
invention.

Normally, included with the DNA construct will be a structural gene having the
necessary regulatory regions for expression in a host and providing for selection of
transformant cells. The gene may provide for resistance to a cytotoxic agent, e.g. antibiotic,
heavy metal, toxin, etc., complementation providing prototrophy to an auxotrophic host, viral
immunity or the like. Depending upon the number of different host species into which the
expression construct or components thereof are introduced, one or more markers may be
employed, where different conditions for selection are used for the different hosts.

Where Agrobacterium is used for plant cell transformation, a vector may be used 
which may be introduced into the Agrobacterium host for homologous recombination with T- 
DNA or the Ti- or Ri-plasmid present in the Agrobacterium host. The Ti- or Ri-plasmid 
containing the T-DNA for recombination may be armed (capable of causing gall formation) 
or disarmed (incapable of causing gall formation), the latter being permissible, so long as the 
vir genes are present in the transformed Agrobacterium host. The armed plasmid can give a 
mixture of normal plant cells and gall. 

In some instances where Agrobacterium is used as the vehicle for transforming host 
plant cells, the expression or transcription construct bordered by the T-DNA border region(s) 
will be inserted into a broad host range vector capable of replication in E. coli and 
Agrobacterium, there being broad host range vectors described in the literature. Commonly 
used is pRK2 or derivatives thereof. See, for example, Ditta, et al. (Proc. Nat. Acad. Sci.,
U.S.A. (1980) 77:7347-7351) and EPA 0 120 515, which are incorporated herein by reference.
Alternatively, one may insert the sequences to be expressed in plant cells into a vector
containing separate replication sequences, one of which stabilizes the vector in E. coli, and
the other in Agrobacterium. See, for example, McBride and Summerfelt (Plant Mol. Biol.
(1990) 14:269-276), wherein the pRiHRI (Jouanin, et al., Mol. Gen. Genet. (1985) 201:370-374)
origin of replication is utilized and provides for added stability of the plant expression
vectors in host Agrobacterium cells.

Included with the expression construct and the T-DNA will be one or more markers,
which allow for selection of transformed Agrobacterium and transformed plant cells. A
number of markers have been developed for use with plant cells, such as resistance to
chloramphenicol, kanamycin, the aminoglycoside G418, hygromycin, or the like. The
particular marker employed is not essential to this invention, one or another marker being
preferred depending on the particular host and the manner of construction.

For transformation of plant cells using Agrobacterium, explants may be combined and
incubated with the transformed Agrobacterium for sufficient time for transformation, the
bacteria killed, and the plant cells cultured in an appropriate selective medium. Once callus
forms, shoot formation can be encouraged by employing the appropriate plant hormones in
accordance with known methods and the shoots transferred to rooting medium for
regeneration of plants. The plants may then be grown to seed and the seed used to establish
repetitive generations and for isolation of vegetable oils.

There are several possible ways to obtain the plant cells of this invention which
contain multiple expression constructs. Any means for producing a plant comprising a
construct having a nucleic acid sequence of the present invention, and at least one other
construct having another DNA sequence encoding an enzyme is encompassed by the present
invention. For example, the expression construct can be used to transform a plant at the same
time as the second construct either by inclusion of both expression constructs in a single
transformation vector or by using separate vectors, each of which expresses desired genes. The
second construct can be introduced into a plant which has already been transformed with the
first expression construct, or alternatively, transformed plants, one having the first construct
and one having the second construct, can be crossed to bring the constructs together in the
same plant.

In general, acyltransferase proteins are active in the transfer of acyl groups from a
donor to a variety of different substrates. For example, diacylglycerol acyltransferases add
acyl groups to diacylglycerol to form triacylglycerol (TAG), or acyl-CoA:cholesterol
acyltransferase uses an acyl-CoA as a donor to transfer an acyl group to a sterol to form a
sterol ester. Typically, the substrates include, but are not limited to, glycerides, including
mono and diglycerides, sterols, stanols, phosphatides, and the like. Donors include, but are
not limited to, acyl-CoA and acyl-ACP molecules.

The invention now being generally described, it will be more readily understood by
reference to the following examples which are included for purposes of illustration only and
are not intended to limit the present invention.

EXAMPLES

Example 1: RNA Isolations 

Total RNA from the inflorescence and developing seeds of Arabidopsis thaliana is
isolated for use in construction of complementary (cDNA) libraries. The procedure is an
adaptation of the DNA isolation protocol of Webb and Knapp (D.M. Webb and S.J. Knapp,
(1990) Plant Molec. Reporter, 8, 180-185). The following description assumes the use of 1 g
fresh weight of tissue. Frozen seed tissue is powdered by grinding under liquid nitrogen. The
powder is added to 10 ml REC buffer (50 mM Tris-HCl, pH 9, 0.8 M NaCl, 10 mM EDTA,
0.5% w/v CTAB (cetyltrimethyl-ammonium bromide)) along with 0.2 g insoluble
polyvinylpolypyrrolidone, and ground at room temperature. The homogenate is centrifuged
for 5 minutes at 12,000 x g to pellet insoluble material. The resulting supernatant fraction is
extracted with chloroform, and the top phase is recovered.

The RNA is then precipitated by addition of 1 volume RecP (50 mM Tris-HCl, pH 9,
10 mM EDTA and 0.5% (w/v) CTAB) and collected by brief centrifugation as before. The
RNA pellet is redissolved in 0.4 ml of 1 M NaCl. The RNA pellet is redissolved in water and
extracted with phenol/chloroform. Sufficient 3 M potassium acetate (pH 5) is added to make
the mixture 0.3 M in acetate, followed by addition of two volumes of ethanol to precipitate the
RNA. After washing with ethanol, this final RNA precipitate is dissolved in water and stored
frozen.
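
The volume of 3 M potassium acetate needed to bring a sample to 0.3 M can be worked out from conservation of solute: if a volume v of stock at concentration c_stock is added to a sample of volume V, then c_stock × v = c_target × (V + v), so v = V × c_target / (c_stock − c_target). The following is a minimal sketch of that arithmetic, assuming the sample contains no acetate before the addition and that volumes are additive; the function names and the 0.9 ml sample volume are illustrative and are not taken from the protocol.

```python
def stock_volume_to_add(sample_volume, stock_conc, target_conc):
    """Volume of stock to add so the final mixture reaches target_conc.

    Assumes the sample initially contains none of the solute and that
    volumes are additive. Units of the result match sample_volume.
    """
    if not 0 < target_conc < stock_conc:
        raise ValueError("target must be positive and below the stock concentration")
    return sample_volume * target_conc / (stock_conc - target_conc)


def ethanol_volume(sample_volume, acetate_volume, volumes=2.0):
    """Two volumes of ethanol relative to the acetate-adjusted mixture."""
    return volumes * (sample_volume + acetate_volume)


# Illustrative numbers only: a 0.9 ml aqueous phase.
acetate_ml = stock_volume_to_add(0.9, stock_conc=3.0, target_conc=0.3)
ethanol_ml = ethanol_volume(0.9, acetate_ml)
```

For a 3 M stock and a 0.3 M target the relation reduces to one-ninth of the sample volume, so a 0.9 ml sample would take about 0.1 ml of stock and then about 2 ml of ethanol.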

Alternatively, total RNA may be obtained using TRIzol reagent (BRL-Life Technologies,
Gaithersburg, MD) following the manufacturer's protocol. The RNA
precipitate is dissolved in water and stored frozen. 




Example 2: Identification of Acyltransferase Homology Sequences

Searches are performed on a Silicon Graphics Unix computer using additional
Bioaccellerator hardware and GenWeb software supplied by Compugen Ltd. This software 
and hardware enables the use of the Smith-Waterman algorithm in searching DNA and 
protein databases using profiles as queries. The program used to query protein databases is 
profilesearch. This is a search where the query is not a single sequence but a profile based on 
a multiple alignment of amino acid or nucleic acid sequences. The profile is used to query a 
sequence data set, i.e., a sequence database. The profile contains all the pertinent information 
for scoring each position in a sequence, in effect replacing the "scoring matrix" used for the 
standard query searches. The program used to query nucleotide databases with a protein 
profile is tprofilesearch. Tprofilesearch searches nucleic acid databases using an amino acid 
profile query. As the search is running, sequences in the database are translated to amino acid 
sequences in six reading frames. The output file for tprofilesearch is identical to the output 
file for profilesearch except for an additional column that indicates the frame in which the 
best alignment occurred. 
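
The six-frame translation step described above can be illustrated with the short sketch below. It is not the tprofilesearch implementation, whose internals are not described here; it only shows how a nucleotide sequence yields six candidate amino acid sequences, three from the given strand and three from its reverse complement. The standard genetic code is assumed, and the frame labels "+1" to "-3" are a naming choice made for this example.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to translations."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

Each of the six translations could then be scored against the profile, with the label of the best-scoring frame reported alongside the hit, which corresponds to the additional output column mentioned above.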

The Smith-Waterman algorithm (Smith and Waterman (1981) supra) is used to
search for similarities between one sequence from the query and a group of sequences
contained in the database. E score values as well as other sequence information, such as
conserved peptide sequences of HXXXXD and PEG, are used to identify related sequences.
By using the conserved peptide sequence information, E score values of greater than E-12 and 
E-8 are considered. For example, the EST sequence originally used to identify ATAT2 had 
an E score of 0.0094, while the EST sequence originally used to identify ATLPAAT1 had an 
E score of 0.0868. 
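
As an illustration of how the conserved peptide information might be checked on a candidate sequence, the sketch below scans a protein sequence for the HXXXXD and PEG motifs, where X is taken to mean any residue. It is a minimal sketch using regular expressions; the function name and the example sequence are hypothetical, and the searches described above are not stated to have been performed this way.

```python
import re

# Lookahead patterns so that overlapping occurrences are also reported.
MOTIFS = {
    "HXXXXD": re.compile(r"(?=H.{4}D)"),
    "PEG": re.compile(r"(?=PEG)"),
}


def find_motifs(protein):
    """Return {motif name: list of 1-based start positions} for a protein."""
    protein = protein.upper()
    return {
        name: [match.start() + 1 for match in pattern.finditer(protein)]
        for name, pattern in MOTIFS.items()
    }


def has_both_motifs(protein):
    """True when at least one HXXXXD and one PEG occurrence are present."""
    hits = find_motifs(protein)
    return all(hits[name] for name in MOTIFS)


# Hypothetical sequence used only to show the call.
example_hits = find_motifs("MAHRSYLDAAPEGT")
```

A candidate with a weak E score that nonetheless carries both motifs could then be retained for closer inspection, in line with the use of motif information described above.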

A protein sequence of glycerol-3-phosphate acyltransferase (G3PAAT) from E. coli (Swiss Prot Accession
P00482) is used to search the NCBI non-redundant protein database using BLAST. In the 
first round of searches, other membrane forms of G3PAAT are identified. In subsequent PSI- 
BLAST searches (Altschul, et al. (1997) Nucleic Acids Res 25:3389-3402), LPAATs and 
other acyltransferases are identified. Using sequence alignment software programs, G3PAAT 
and different LPAAT amino acid sequences are aligned, and a profile is generated using a 
homologous sequence region, between amino acids 256 and 459 of the E. coli sequence. 

The identified 204 amino acid region is used to query the protein database using PSI-BLAST.
After 5 iterations of PSI-BLAST, the profile generated from this new query (Figure 1)
identified soluble forms of G3PAAT. Prior to this identification, no sequence homology had
been identified between the membrane and soluble forms of G3PAAT.



Example 3: Excision of PSI-BLAST Profile

The profile generated from the queries using PSI-BLAST is excised from the hypertext
markup language (html) file. The worldwide web (www)/html interface to psiblast at
ncbi stores the current generated profile matrix in a hidden field in the html file that is
returned after each iteration of psiblast. However, this matrix has been encoded into string62
(s62) format for ease of transport through html. String62 format is a simple conversion of the
values of the matrix into html legal ascii characters.

The encoded matrix width (x axis) is 26 characters, and comprises the consensus
character, the probabilities of each amino acid in the order A,B,C,D,E,F,G,H,I,K,L,M,N,
P,Q,R,S,T,V,W,X,Y,Z (where B represents D and N, Z represents Q and E, and X
represents any amino acid), the gap creation value, and the gap extension value.

The length (y axis) of the matrix corresponds to the length of the sequences identified
by PSI-BLAST. The order of the amino acids corresponds to the conserved amino acid
sequence of the sequences identified using PSI-BLAST, with the N-terminal end at the top of
the matrix. The probabilities of other amino acids at that position are represented for each
amino acid along the x axis, below the respective single letter amino acid abbreviation.

Thus, each row of the profile consists of the highest scoring (consensus) amino acid,
followed by the scores for each possible amino acid at that position in sequence matrix, the
score for opening a gap at that position, and the score for continuing a gap at that position.

The string62 file is converted back into a profile for use in subsequent searches. The
gap open field is set to 11 and the gap extension field is set to 1 along the x axis. The gap
creation and gap extension values are known, based on the settings given to the PSI-BLAST
algorithm. The matrix is exported to the standard GCG profile form. This format can be read
by GenWeb.

The algorithm used to convert the string62 formatted file to the matrix is outlined in
Table 1.

Table 1

1. if encoded character z then the value is blast score min
2. if encoded character Z then the value is blast score max
3. else if the encoded character is uppercase then its value is (64-(ascii # of char))
4. else if the encoded character is a digit the value is ((ascii # of char)-48)
5. else if the encoded character is not uppercase then the value is ((ascii # of char) - 87)
6. ALL B positions are set to min of D and N amino acids at that row in sequence matrix
7. ALL Z positions are set to min of Q and E amino acids at that row in sequence matrix
8. ALL X positions are set to min of all amino acids at that row in sequence matrix
9. kBLAST_SCORE_MAX=999;
10. kBLAST_SCORE_MIN=-999;
11. all gap opens are set to 11
12. all gap lens are set to 1
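
The rules of Table 1 can be sketched in code as follows. This is an illustrative reading of the table rather than the conversion program actually used. It assumes that each encoded row is a 26-character string laid out as described above (consensus character, 23 score characters in the order A through Z, then two gap characters), that rule 5 applies to lowercase letters, and that "all amino acids" in rule 8 means the 20 standard residues. The names `decode_char` and `decode_row` are hypothetical.

```python
BLAST_SCORE_MAX = 999    # rule 9
BLAST_SCORE_MIN = -999   # rule 10
GAP_OPEN = 11            # rule 11
GAP_EXTEND = 1           # rule 12

# Column order of the 23 score positions, as given in the text.
COLUMNS = "ABCDEFGHIKLMNPQRSTVWXYZ"
# Assumption: the 20 standard amino acids are the columns other than B, X, Z.
STANDARD = [c for c in COLUMNS if c not in "BXZ"]


def decode_char(ch):
    """Decode one string62 character into an integer score (rules 1-5)."""
    if ch == "z":
        return BLAST_SCORE_MIN
    if ch == "Z":
        return BLAST_SCORE_MAX
    if "A" <= ch <= "Y":
        return 64 - ord(ch)   # 'A' -> -1, 'B' -> -2, ...
    if "0" <= ch <= "9":
        return ord(ch) - 48   # '0' -> 0, ..., '9' -> 9
    if "a" <= ch <= "y":
        return ord(ch) - 87   # 'a' -> 10, 'b' -> 11, ...
    raise ValueError(f"unexpected string62 character: {ch!r}")


def decode_row(encoded):
    """Decode one 26-character row into consensus, scores and gap values."""
    if len(encoded) != 26:
        raise ValueError("expected a 26-character encoded row")
    scores = {col: decode_char(ch) for col, ch in zip(COLUMNS, encoded[1:24])}
    scores["B"] = min(scores["D"], scores["N"])        # rule 6
    scores["Z"] = min(scores["Q"], scores["E"])        # rule 7
    scores["X"] = min(scores[aa] for aa in STANDARD)   # rule 8
    # Rules 11 and 12: the encoded gap characters are replaced by fixed values.
    return {
        "consensus": encoded[0],
        "scores": scores,
        "gap_open": GAP_OPEN,
        "gap_extend": GAP_EXTEND,
    }
```

Under this reading the encoding covers scores from -25 to 34 directly, with z and Z reserved for the minimum and maximum sentinel values.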



Example 4: Identification of Novel Acyltransferase Related Amino Acid Sequences

The profile (Figure 1) is used in further queries to identify a number of previously
unidentified proteins from yeast as novel acyltransferases. A protein is identified from an
Arabidopsis protein sequence database (ATAT1) (SEQ ID NO:2). Sequences are also
identified from nucleic acid databases (Table 2).



Table 2

Database ID Number    BLAST Search Hits                             Log probability

Saccharomyces cerevisiae
gi 1078509            Limnanthes putative LPAAT                     e-10 (SEQ ID NO:217)
gi 586485             Limnanthes putative LPAAT                     e-13 (SEQ ID NO:218)
gi 320748             Limnanthes putative LPAAT                     e-19 (SEQ ID NO:219)
gi 2506920            SUPPRESSES CTR1 (choline transport mutant)    (SEQ ID NO:220)
gi 549627             similar to CTR1                               e-118 (SEQ ID NO:221)
gi 2133031            unidentified                                  (SEQ ID NO:222)
gi 2132939            unidentified                                  (SEQ ID NO:223)
gi 2132299            TAFAZZIN                                      e-14 (SEQ ID NO:224)

In Table 2, the gi number is the database identifier, the middle column shows the
results of BLAST searches against the NCBI NR protein database, and the log probability
number represents the log of the probability of such a match occurring by random
chance. These proteins, including the ATAT1 protein sequence, are identified using the
original PSI-BLAST search of the NCBI NR protein database. Thus, these proteins are novel
acyltransferase related proteins with unidentified activities.

The Arabidopsis acyltransferase sequence, herein referred to as ATAT1, is also
identified using the original PSI-BLAST search of the NCBI NR protein database, and did not
have an annotated function.

Additional Arabidopsis amino acid sequences related to acyltransferases are identified
from the databases, referred to as ATAT2est, ATAT3est, ATAT4est, ATAT5est, ATAT6est,
ATAT7est, ATAT8est, ATAT9, ATAT10, and ATAT11est. Furthermore, Arabidopsis
amino acid sequences are identified which demonstrate sequence similarity to known
lysophosphatidic acid acyltransferases, referred to as ATLPAAT1. The sequences of ATAT9
and ATAT10 are identified from the database as genomic sequences; all other Arabidopsis
sequences are identified as ESTs.

To obtain the entire coding region corresponding to the Arabidopsis acyltransferase
sequences, synthetic oligo-nucleotide primers are designed to amplify the 5' and 3' ends of
partial cDNA clones containing acyltransferase related sequences. Primers are designed
according to the respective Arabidopsis acyltransferase related sequences (Table 3) and used
in Rapid Amplification of cDNA Ends (RACE) reactions (Frohman et al. (1988) Proc. Natl.
Acad. Sci. USA 85:8998-9002) using the Marathon cDNA amplification kit (Clontech
Laboratories Inc, Palo Alto, CA). Primers with an R designation are used for 5' RACE
reactions, and primers with an F designation are used for 3' RACE reactions.

Table 3
ATAT2 

ATAT2R 1 CCATCCGCTTCAAGGG AACG ACACCC ATCA (SEQ ID NO: 1 35) 

ATAT2R2 TCCCTGTCTTGCTTGATG AACTTAA AGCTTG (SEQ ID NO: 1 36) 

ATAT2R3 ACAGC AGGAGTGTCTGATG ATGGC AGATTC (SEQ ID NO: 1 37) 

ATAT3 



ATAT3R 1 ACTGGAGTTCCAGCCAAAAATGCACCTGTC (SEQ ID NO: 138) 
ATAT3R2 GATACACCCTTG AAATCAGGCG ATTTTGCT (SEQ ID NO: 1 39) 

ATAT4 



ATAT4R1 TTGCAAATTCAATTCCTGTTTCACCGGGCC (SEQ ID NO: 140) 
ATAT4R2 GTTTTCTGCTATTCCAGAAGGCGTCAACAA (SEQ ID NO: 141) 



ATAT5 



ATAT5R 1 CATTGAAGATCCGTCCGTGAAGTTNCCTTACC (SEQ ID NO: 142) 

ATAT5R2 TCGAGCTGTGATCGATGATTGGCTGTGAAG (SEQ ID NO: 143) 

ATAT5F1 GTCTCTTC A A A A AC AC AC AC AC ACGTCTCT (SEQ ID NO: 144) 

ATAT5F2 GTCTCTTCAAAAACACACACACACGTCTCT (SEQ ID NO: 145) 



ATAT6 



H76348-F1 GTAGAGAGCCTTACTTGCTTCGGTTTAGTC (SEQ ID NO: 146) 
H76348-F2 ACGTCATCGTACCTGTTGCTATTGACTCAC (SEQ ID NO: 147) 
H76348-R1 ACTTTTCCATTGTCAGGGACTCCTCGACAC (SEQ ID NO: 148) 
H76348-R2 ACGGTGTAGGAAGGGAAAGGATTCAAAAGG (SEQ ID NO: 149) 



ATAT7 



ATTS0193-F1 GCGATpAACTACAGAGTCGGATTCTTCCTC (SEQ ID NO: 150) 
ATTS0193-F2 CCGGTTTACGAGATTACGTTCTTGAACCAG (SEQ ID NO: 151) 
ATTS0193-R1 CAATGGAGACAAGGCTCGAAAGTGCTAACC (SEQ ID NO: 152) 
ATTS0193-R2 ATTCTCTGAACATAGTTCGCCACGGTCATG (SEQ ID NO: 153) 



WO 00/18889 PCT/US99/22231



ATAT8
AA042618-F1   GAAATCCAACGCCTTCCCAATATCACTCTG     (SEQ ID NO:154)
AA042618-F2   CTTCAACTTTCCATCAGGATCTTGGCACGT     (SEQ ID NO:155)
AA042618-R1   ACCACTTGTTAGAGACCTTACCTGCTTAGG     (SEQ ID NO:156)
AA042618-R2   TCCTACCTACACCATCCAATTTCTCGACCC     (SEQ ID NO:157)

ATAT11
ATAT11R1      CTGCGTCAAGTGAGCAACTCAGTTCTTGCA     (SEQ ID NO:158)
ATAT11R2      TGGGAAGCAGCACGTTGTTCAGTATCGGAA     (SEQ ID NO:159)
ATAT11R3      TAGCCTCTGTGTAATCTGTGCCCTCGGGGA     (SEQ ID NO:160)
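
When handling RACE primers such as those listed above, it is often convenient to check
primer length and GC content and to obtain the reverse complement. The following Python
sketch does this for primer ATAT3R1 (SEQ ID NO:138); the helper names are illustrative and
the checks are not part of the procedure described in this specification.

```python
# Complement table for DNA; N (any base) is left as N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

# Primer ATAT3R1 as listed in Table 3 (SEQ ID NO:138).
ATAT3R1 = "ACTGGAGTTCCAGCCAAAAATGCACCTGTC"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_count(seq: str) -> int:
    """Count the G and C bases in a DNA sequence."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    return gc_count(seq) / len(seq)


if __name__ == "__main__":
    print(len(ATAT3R1), gc_fraction(ATAT3R1))
    print(reverse_complement(ATAT3R1))
```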



From the nucleic acid sequences obtained from the RACE reactions, protein sequence
is predicted for each nucleic acid sequence using MacVector software. Nucleic acid sequences
are provided for ATAT1 (SEQ ID NO:1), ATAT2 (SEQ ID NO:3), ATAT3 (SEQ ID NO:5),
ATAT4 (SEQ ID NO:7), ATAT5 (SEQ ID NO:9), ATAT6 (SEQ ID NO:10), ATAT7 (SEQ
ID NO:12), ATAT8 (SEQ ID NO:14), ATAT9 (SEQ ID NO:16), ATAT10 (SEQ ID NO:18),
ATAT11 (SEQ ID NO:20) and ATLPAAT1 (SEQ ID NO:22), respectively.
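
The prediction of a protein sequence from a nucleic acid sequence can be sketched in a few
lines of Python. The example below is not the MacVector procedure; it simply applies the
standard genetic code to the first 30 nucleotides of SEQ ID NO:1 as given in the sequence
listing. The function and variable names are illustrative.

```python
# Standard genetic code in the conventional TCAG ordering of the three codon positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 1, stopping at the first stop codon.

    Codons containing unrecognized characters are reported as 'X'.
    """
    dna = dna.upper().replace("U", "T")
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE.get(dna[pos:pos + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    # First 30 nucleotides of SEQ ID NO:1 (ATAT1).
    print(translate("atggttcatgcgaccaagtcagccacaacg"))
```

The residues obtained for this fragment agree with the start of SEQ ID NO:2
(Met Val His Ala Thr Lys Ser Ala Thr Thr).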

The protein sequence (SEQ ID NO:2) derived from the ATAT1 nucleic acid sequence
from Arabidopsis has a predicted molecular mass of 32.5 kDa and a pI of 9.74. Alignment
of the Arabidopsis acyltransferase with several LPAAT and G3PAAT sequences shows that
some of the domains that are conserved between LPAAT and G3PAAT are conserved in the new
acyltransferase protein.

The ATAT2 nucleic acid sequence is predicted to encode a 312 amino acid protein
(SEQ ID NO:4), with a molecular weight of 34.6 kD, and a pI of 9.99. The ATAT2 protein
may also contain 2 to 3 transmembrane domains. However, the protein encoded by the
ATAT2 nucleic acid sequence may be longer than predicted because of the absence of an
in-frame stop codon upstream of the ATG start codon used.

The ATAT3 nucleic acid sequence is predicted to encode a 398 amino acid protein
(SEQ ID NO:6), with a molecular weight of 44.7 kD, and a pI of 5.62. The ATAT3 protein
may contain 1 to 4 transmembrane domains. The ATAT4 nucleic acid sequence is predicted
to encode a 317 amino acid protein (SEQ ID NO:8), with a molecular weight of 36.5 kD, and
a pI of 9.67. The ATAT4 protein is predicted to have 2 to 5 transmembrane domains.
The ATLPAAT1 nucleic acid sequence is predicted to encode a 389 amino acid
protein (SEQ ID NO:23), with a molecular weight of 43.7 kD, and a pI of 9.52. The
ATLPAAT1 protein is predicted to have up to 3 transmembrane domains. The protein
predicted from the ATLPAAT1 nucleic acid sequence is similar to LPAATs reported for
Brassica, maize, and meadowfoam (described in PCT Publication WO 94/13814). The
ATAT11 nucleic acid sequence is predicted to encode a 375 amino acid protein (SEQ ID
NO:21), with a molecular weight of 43.5 kD, and a pI of 9.45. The deduced amino acid
sequences of ATAT6 (SEQ ID NO:11), ATAT7 (SEQ ID NO:13), ATAT8 (SEQ ID NO:15),
ATAT9 (SEQ ID NO:17), and ATAT10 (SEQ ID NO:19) are also provided.
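
The reported molecular weights can be sanity-checked against protein length. The Python
sketch below uses the common rule of thumb of roughly 110 Da per amino acid residue, which
is an assumption introduced here and not the method used to obtain the values above; a
precise calculation would sum the masses of the actual residues. Names in the code are
illustrative.

```python
# Rule-of-thumb average mass per residue, in daltons (an approximation, not from the source).
AVERAGE_RESIDUE_MASS_DA = 110.0

# (length in amino acids, reported molecular weight in kD) as stated in the text above.
REPORTED = {
    "ATAT2": (312, 34.6),
    "ATAT3": (398, 44.7),
    "ATAT4": (317, 36.5),
    "ATLPAAT1": (389, 43.7),
    "ATAT11": (375, 43.5),
}


def estimate_mass_kd(length_aa: int) -> float:
    """Estimate protein mass in kD from its length alone."""
    return length_aa * AVERAGE_RESIDUE_MASS_DA / 1000.0


def relative_difference(name: str) -> float:
    """Signed difference between the estimate and the reported value, as a fraction."""
    length_aa, reported_kd = REPORTED[name]
    return (estimate_mass_kd(length_aa) - reported_kd) / reported_kd


if __name__ == "__main__":
    for name, (length_aa, reported_kd) in REPORTED.items():
        print(name, round(estimate_mass_kd(length_aa), 2), reported_kd)
```

With this approximation the estimates fall within a few percent of the reported values,
which suggests the stated lengths and masses are mutually consistent.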

A sequence region approximately 30 amino acids upstream through approximately
100 amino acids downstream of the conserved amino acid sequences HXXXXD (Heath and
Rock, (1998) J. Bacteriol. 180(6):1425-1430) and PEG (Neuwald (1997) Curr Biol 7:R465-R466)
of the predicted amino acid sequences derived from the nucleic acid sequences of
ATAT1, ATAT2, ATAT3, ATAT4, ATAT6, ATAT7, ATAT8, ATAT9, ATAT10,
ATLPAAT1, and ATAT11 are compared to the amino acid sequences of lysophosphatidic
acid acyltransferase (Jojoba AT (SEQ ID NO:162; the nucleic acid sequence is provided in
SEQ ID NO:161), maize AT (PCT Publication WO 94/13814), PLSC coco (GenBank
accession 1098605), PLSC Lim (GenBank accession 1209507), PLSC Ecoli (GenBank
accession 1209507), and PLSC Yeast (GenBank accession 464422)) and glycerol-3-phosphate
acyltransferase (PLSB Ecoli (GenBank accession 130326) and PLSB Mouse (GenBank
accession 2498786)) (Figure 2), and similarities are identified (Figure 2 and Figure 3).

Sequence comparisons reveal that several classes of acyltransferases exist, based on
conserved amino acid sequences identified in the comparisons in Figure 2. For example,
ATAT1, ATAT6, ATAT7, ATAT8, and ATAT9 contain the conserved amino acid
sequences VTYSXS (SEQ ID NO:128), VXLTRXR (SEQ ID NO:129), and LXXGDLV (SEQ
ID NO:132) between the HXXXXD and PEG sequences. In addition, ATAT1, ATAT6,
ATAT7, ATAT8, and ATAT9 also contain the conserved sequence CPEGT (SEQ ID NO:130),
which comprises the PEG sequence, as well as IVPVA (SEQ ID NO:131) and
VANXXQ (SEQ ID NO:134) (Figure 2) downstream of the PEG sequence. The sequences
corresponding to ATAT1, ATAT7, and ATAT9 are the most closely related in this class, with
similarities between ATAT1 and ATAT9 of 67.0%, between ATAT1 and ATAT7 of 58.2%
and between ATAT9 and ATAT7 of 63.9% (Figure 3B).

Sequence comparisons also demonstrate that the sequence of ATLPAAT1 is most
closely related to the jojoba LPAAT (82.3% similar) and the maize LPAAT (78.0% similar).

Furthermore, sequence analysis demonstrates that ATAT4 is the most divergent
sequence, with the highest similarity to ATAT10 (18.5%). The highest similarity (15.3%) to a
known sequence is with a meadowfoam (Limnanthes douglasii) LPAAT. However, the
sequences of ATAT4 and ATAT10 share several conserved peptide sequences with the amino
acid sequences of ATAT2 and ATAT3 (Figure 2): VXNHXS (SEQ ID NO:127), where the H
comprises the conserved H of the HXXXXD sequence, and FXXGAF (SEQ ID NO:133)
downstream of the PEG sequence.
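
Conserved peptide sequences written with X for any amino acid, such as those discussed
above, can be located in a protein sequence with a simple pattern search. The following
Python sketch converts each motif into a regular expression and reports match positions.
The protein string TOY_PROTEIN is a made-up example constructed for illustration; it is not
one of the sequences of the sequence listing, and the function names are hypothetical.

```python
import re

# Motifs named in the text; X stands for any amino acid.
CLASS_MOTIFS = ["HXXXXD", "VTYSXS", "VXLTRXR", "LXXGDLV", "CPEGT", "IVPVA", "VANXXQ"]

# Hypothetical toy sequence containing each motif once (not a real protein).
TOY_PROTEIN = "MAHRSALDVTYSASGGVKLTRDRAALKEGDLVICPEGTTAAIVPVAGGVANKLQ"


def motif_to_regex(motif: str) -> str:
    """Turn a motif such as 'HXXXXD' into a regular expression such as 'H....D'."""
    return "".join("." if residue in "Xx" else re.escape(residue.upper()) for residue in motif)


def find_motif(protein: str, motif: str) -> list:
    """Return the 0-based start positions of non-overlapping matches of the motif."""
    return [m.start() for m in re.finditer(motif_to_regex(motif), protein.upper())]


if __name__ == "__main__":
    for motif in CLASS_MOTIFS:
        print(motif, find_motif(TOY_PROTEIN, motif))
```

Comparing the reported positions with those of HXXXXD and PEG gives the relative spacing of
each conserved sequence.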


Example 6: Identification of Additional Acyltransferase Sequences 

The novel Arabidopsis sequences identified above are used to search proprietary
databases containing soybean and corn EST sequences. The results of this search identify
EST sequences from soybean (SEQ ID NO:24 through SEQ ID NO:85) as well as from corn
(SEQ ID NO:86 through SEQ ID NO:126) as encoding acyltransferase related proteins.

Sequence comparisons between the various EST sequences and the complete
Arabidopsis sequences reveal that the identified EST sequences demonstrate higher
similarity to the various Arabidopsis sequences as determined by BLAST scores.

Expressed Sequence Tag (EST) sequences from soybean and corn databases are
identified which are most closely related by BLAST score to ATAT1 (SEQ ID NOS:24-29
and SEQ ID NOS:86-88, respectively), ATAT2 (SEQ ID NO:30 and SEQ ID NO:89,
respectively), ATAT3 (SEQ ID NOS:31-35 and SEQ ID NOS:90-94, respectively), ATAT4
(SEQ ID NOS:36-44 and SEQ ID NOS:95-100, respectively), ATAT6 (SEQ ID NOS:45-49
and SEQ ID NO:101, respectively), ATAT7 (SEQ ID NOS:50-54 and SEQ ID NOS:102-103,
respectively), ATAT8 (SEQ ID NOS:55-56 and SEQ ID NO:104, respectively), ATAT9
(SEQ ID NOS:57-79 and SEQ ID NOS:105-111, respectively), ATAT10 (SEQ ID NOS:80-81
and SEQ ID NO:112, respectively), ATAT11 (SEQ ID NOS:82-85 and SEQ ID
NOS:123-126, respectively), and ATLPAAT1 (SEQ ID NOS:113-122).

Example 7: Expression Construct Preparation

A series of synthetic oligonucleotide primers were prepared for use in Polymerase
Chain Reactions (PCR) to amplify the entire DNA sequences encoding the various
acyltransferase sequences identified above. The sequences are listed in Table 3.



Table 3

Primer        Sequence (listed 5'-3')                                        SEQ ID NO:

ATAT1F        AAGCTTGCATGCGTCGACACAATGGTTCATGCGACCAAGTCAG                    163
ATAT1R        GGTACCGTCGACTCACTTCTTGGTGTTGTTGATAG                            164
ATAT2F        GGATCCGCGGCCGCACAATGACGAGCTTTACTACTTCCCTTCAT                   165
ATAT2R        GGATCCCCTGCAGGTTAGAGATCCATTGATTCTGCAAT                         166
ATAT3F        GGATCCGCGGCCGCATAATGGAATCAGAGCTCAAAGAT                         167
ATAT3R        GGATCCCCTGCAGGTCATTCTTCTTTCTGATGGAAATC                         168
ATAT4F        GGATCCGCGGCCGCACAATGACTCGTTCACAAGATGTTTCA                      169
ATAT4R        GGATCCCCTGCAGGTCACTTCTCTTCCAATCTAGCCAG                         170
ATAT6F        GGATCCGCGGCCGCACAATGTCCGGTAATAAGATCTCGACTCTTCA                 171
ATAT6R        GGATCCCCTGCAGGTTATTTTTTCTTGACAACTCCGTTATTACCGG                 172
ATAT7F        ATATCCGCGGCCGCACAATGGTTATGGAGCAAGCTGGAA                        173
ATAT7R        GGATCCCCTGCAGGTCAATGGAGACAAGGCTCGAAAGT                         174
ATAT8F        GGATCCGCGGCCGCACAATGTCCGCCAAGATTTCAATATTCC                     175
ATAT8R        GGATCCCCTGCAGGTTAATTTTTCTTAACTACTCCATT                         176
ATAT9F        GGATCCGCGGCCGCACAATGGGAGCTCAGGAGAAACGGCGCC                     177
ATAT9R        GGATCCCCTGCAGGTCACGTCTTCTCCTTCTTCACCGG                         178
ATAT10F       GGATCCGCGGCCGCACAATGGCGGATCCTGATCTGTCTTCTCCT                   179
ATAT10R       GGATCCCCTGCAGGTTATGTTGGGGCCAAGTCAGGTGCAAAGAT                   180
ATAT11F       GGATCCGCGGCCGCAAAATGGAAAAAAAGAGTGTACCAAATTCT                   181
ATAT11R       GGATCCCCTGCAGGTTATTTGTTTACTAATTTGAGGGAATTTTTTG                 182
ATLPAAT1F     TCGACCTGCAGGAAGCTTAAGGATGGTGATTGCTGC                           183
ATLPAAT1R     GGATCCGCGGCCGCTTACTTCTCCTTCTCCG                                184
YSCAT1F       GGATCCGCGGCCGCACAATGTCTTTTAGGGATGTCCTAG                        185
YSCAT1R       GGATCCCCTGCAGGTCAATCATCCTTACCCTTTGGTTTACC                      186
YSCAT1 KO F   ATGTCTTTTAGGGATGTCCTAGAAAGAGGAGATGAATTTTCTGTGCGGTATTTCACACCG   187
YSCAT1 KO R   TCAATCATCCTTACCCTTTGGTTTACCCTCTGGAGGCAGAAGATTGTACTGAGAGTGCAC   188
YSCAT2F       GGATCCGCGGCCGCACAATGAAGCATTCCCAAAAATACCGTAGG                   189
YSCAT2R       GGATCCCCTGCAGGTCAATGATTTTTTTTCATCACAAATAC                      190
YSCAT2 KO F   ATGAAGCATTCCCAAAAATACCGTAGGTATGGAATTTATGCTGTGCGGTATTTCACACCG   191
YSCAT2 KO R   TCAATGATTTTTTTTCATCACAAATACAAGAATAAGAAAAAGATTGTACTGAGAGTGCAC   192
YSCAT3F       GGATCCGCGGCCGCACAATGGGTTTTGTTGATTTCTTCGAAAC                    193
YSCAT3R       GGATCCCCTGCAGGTTATTTGGTCTCAATTTTAATATTTTTTTGC                  194
YSCAT3 KO F   ATGGGTTTTGTTGATTTCTTCGAAACATATATGGTCGGTTCTGTGCGGTATTTCACACCG   195
YSCAT3 KO R   TTATTTGGTCTCAATTTTAATATTTTTTTGCAAGGACTCGAGATTGTACTGAGAGTGCAC   196
YSCAT4F       GGATCCGCGGCCGCACAATGGAAAAGTACACCAATTGGAGAGAC                   197
YSCAT4R       GGATCCCCTGCAGGCTACTTCCTCTTTTTACGTTGATCGCTG                     198
YSCAT4 KO F   [first portion missing in source]CTGTGCGGTATTTCACACCG          199
YSCAT4 KO R   CTACTTCCTCTTTTTACGTTGATCGCTGATATATTCCTTCAGATTGTACTGAGAGTGCAC   200
YSCAT5F       GGATCCGCGGCCGCACAA[remainder illegible in source]              201
YSCAT5R       [illegible in source]                                          202
YSCAT5 KO F   [illegible in source]                                          203
YSCAT5 KO R   [illegible in source]                                          204
YSCAT6F       [illegible in source]                                          205
YSCAT6R       GGATCCCC[illegible in source]TCTG                              206
YSCAT6 KO F   ATGTCTGCTCCCGCTGCCGATCATAACGCTGCCAAACCTACTGTGCGGTATTTCACACCG   207
YSCAT6 KO R   TCATTCTTT[illegible in source]AGATTGTACTGAGAGTGCAC             208
YSCAT7F       GGATCCGCGGCCGCACAATGCTGCATCAAAAAATAGCTCATAAAGTTCG              209
YSCAT7R       GGATCCCCTGCAGGTCAAAAAATAAAACAATAAAGTTTATAAACTAACC              210
YSCAT7 KO F   ATGCTGCATCAAAAAATAGCTCATAAA[illegible in source]CTGTGCGGTATTTCACACCG   211
YSCAT7 KO R   TCAAAAAATAAAACAATAAA[remainder illegible in source]            212
YSCAT8F       GGATCCGCGGCCGCACAA[remainder illegible in source]              213
YSCAT8R       [illegible in source]                                          214
YSCAT8 KO F   ATGAGTGTGATAGGTAGGTTCTTGTATTACTTGAGGTCCGCTGTGCGGTATTTCACACCG   215
YSCAT8 KO R   TTAATGCATCTTTTTTACAGATGAACCTTCGTTATGGGTAAGATTGTACTGAGAGTGCAC   216


The entire coding regions for each of the acyltransferase sequences were amplified
using the respective primers listed in Table 3 above, cloned into the vector pCR2.1-TOPO
(Invitrogen) or pZero (Invitrogen), and labeled as pCGN8558 (ATAT1), pCGN8564
(ATAT2), pCGN8565 (ATAT3), pCGN8566 (ATAT4), pCGN8918 (ATAT6),
pCGN8913 (ATAT7), pCGN8904 (ATAT8), pCGN9970 (ATAT9), pCGN9940
(ATAT10), pCGN8567 (ATAT11), pCGN8632 (ATLPAAT1), pCGN9901 (YSCAT1,
also referred to as gi 2132299), pCGN9902 (YSCAT2, also referred to as gi 1078509),
pCGN9903 (YSCAT3, also referred to as gi 2132939), pCGN9904 (YSCAT4, also
referred to as gi 2133031), pCGN9905 (YSCAT5, also referred to as gi 320748), pCGN9906
(YSCAT6, also referred to as gi 549627), pCGN9907 (YSCAT7, also referred to as
gi 586485), and pCGN9908 (YSCAT8, also referred to as gi 464422). The nucleic acid
sequences for the respective yeast acyltransferases are provided as YSCAT1 (SEQ ID
NO:225), YSCAT2 (SEQ ID NO:226), YSCAT3 (SEQ ID NO:227), YSCAT4 (SEQ ID
NO:228), YSCAT5 (SEQ ID NO:229), YSCAT6 (SEQ ID NO:230), YSCAT7 (SEQ ID
NO:231), and YSCAT8 (SEQ ID NO:232).

7A. Baculovirus Expression Constructs

Constructs are prepared to direct the expression of the Arabidopsis ATAT sequences
in cultured insect cells. The entire coding regions of ATAT1, 2, 3, 4, 6, 7, 8, 9, 10, and 11 are
cloned into the vector pFastBac1 (Gibco-BRL, Gaithersburg, MD) digested with NotI and
PstI. The respective coding sequences were cloned as NotI/Sse8387I fragments. Double
stranded DNA sequence was obtained to verify that no errors were introduced by PCR
amplification. The resulting plasmids were designated pCGN9723 (ATAT1), pCGN9724
(ATAT2), pCGN9725 (ATAT3), pCGN9726 (ATAT4), pCGN9727 (ATAT5), pCGN9728
(ATAT7), pCGN9729 (ATAT8), pCGN9991 (ATAT9), pCGN9730 (ATAT10), and pCGN9731
(ATAT11).
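
The cloning of the coding sequences as NotI/Sse8387I fragments relies on restriction sites
built into the PCR primers of Table 3. The following Python sketch locates such sites in the
ATAT2 primer pair. The recognition sequences used (BamHI GGATCC, NotI GCGGCCGC, Sse8387I
CCTGCAGG, PstI CTGCAG) are the commonly published ones and are supplied here as an
assumption, since they are not stated in this specification; the function name is
illustrative.

```python
# Commonly published recognition sequences (assumed; not given in the text).
SITES = {
    "BamHI": "GGATCC",
    "NotI": "GCGGCCGC",
    "Sse8387I": "CCTGCAGG",
    "PstI": "CTGCAG",
}

# ATAT2 forward and reverse primers from Table 3 (SEQ ID NO:165 and SEQ ID NO:166).
PRIMERS = {
    "ATAT2F": "GGATCCGCGGCCGCACAATGACGAGCTTTACTACTTCCCTTCAT",
    "ATAT2R": "GGATCCCCTGCAGGTTAGAGATCCATTGATTCTGCAAT",
}


def find_sites(seq: str) -> dict:
    """Map each enzyme whose site occurs in seq to the 0-based position of its first occurrence."""
    seq = seq.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

In this pair the forward primer carries a NotI site ahead of the ATG start codon and the
reverse primer carries an Sse8387I site, which contains a PstI site, consistent with
directional cloning into a vector opened with NotI and PstI.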

7B. Plant Expression Construct Preparation 

A plasmid containing the napin cassette derived from pCGN3223 (described in USPN
5,639,790, the entirety of which is incorporated herein by reference) was modified to make it
more useful for cloning large DNA fragments containing multiple restriction sites, and to
allow the cloning of multiple napin fusion genes into plant binary transformation vectors. An
adapter comprised of the self-annealed oligonucleotide of sequence
CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT
(SEQ ID NO:233) was ligated into the cloning vector pBC SK+ (Stratagene) after
digestion with the restriction endonuclease BssHII to construct vector pCGN7765. Plasmids
pCGN3223 and pCGN7765 were digested with NotI and ligated together. The resultant
vector, pCGN7770, contains the pCGN7765 backbone with the napin seed-specific
expression cassette from pCGN3223.
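
An oligonucleotide can anneal to itself when, apart from its single-stranded overhang, it
equals its own reverse complement. The Python sketch below checks this for the adapter of
SEQ ID NO:233, treating the first four bases (CGCG) as the 5' overhang that would pair with
BssHII-cut ends. That overhang length and the listed recognition sequences (SwaI ATTTAAAT,
AscI GGCGCGCC, Sse8387I CCTGCAGG, NotI GCGGCCGC) are assumptions based on commonly published
enzyme data, and the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Adapter oligonucleotide, SEQ ID NO:233, written 5' to 3'.
ADAPTER = "CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT"

# Commonly published recognition sequences (assumed; not given in the text).
SITES = {
    "SwaI": "ATTTAAAT",
    "AscI": "GGCGCGCC",
    "Sse8387I": "CCTGCAGG",
    "NotI": "GCGGCCGC",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_self_annealing(oligo: str, overhang: int) -> bool:
    """True if the oligo, minus its 5' overhang, equals its own reverse complement."""
    core = oligo.upper()[overhang:]
    return core == reverse_complement(core)


if __name__ == "__main__":
    print(is_self_annealing(ADAPTER, overhang=4))
    print([name for name, site in SITES.items() if site in ADAPTER])
```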

The cloning cassette, pCGN7787, contains essentially the same regulatory elements as
pCGN7770, except that the napin regulatory regions of pCGN7770 have been
replaced with the double CaMV 35S promoter and the tml polyadenylation and
transcriptional termination region.

A binary vector for plant transformation, pCGN5139, was constructed from
pCGN1558 (McBride and Summerfelt, (1990) Plant Molecular Biology, 14:269-276). The
polylinker of pCGN1558 was replaced as a HindIII/Asp718 fragment with a polylinker
containing unique restriction endonuclease sites, AscI, PacI, XbaI, SwaI, BamHI, and NotI.
The Asp718 and HindIII restriction endonuclease sites are retained in pCGN5139.




A series of turbo binary vectors are constructed to allow for the rapid cloning of DNA 
sequences into binary vectors containing transcriptional initiation regions (promoters) and 
transcriptional termination regions. 

The plasmid pCGN8618 was constructed by ligating oligonucleotides
5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:234) and
5'-TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:235) into SalI/XhoI-digested
pCGN7770. A fragment containing the napin promoter, polylinker and napin 3'
region was excised from pCGN8618 by digestion with Asp718I; the fragment was blunt-ended
by filling in the 5' overhangs with Klenow fragment then ligated into pCGN5139 that
had been digested with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs
with Klenow fragment. A plasmid containing the insert oriented so that the napin promoter
was closest to the blunted Asp718I site of pCGN5139 and the napin 3' was closest to the
blunted HindIII site was subjected to sequence analysis to confirm both the insert orientation
and the integrity of cloning junctions. The resulting plasmid was designated pCGN8622.
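
Before ligating a pair of oligonucleotides as a linker, it can be useful to confirm that the
two strands pair with each other and to see which single-stranded ends remain. The Python
sketch below does this for SEQ ID NO:234 and SEQ ID NO:235, assuming four-base 5' overhangs
(the overhang length is an assumption consistent with SalI and XhoI ends, which are commonly
described as leaving 5'-TCGA overhangs). The function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Linker oligonucleotides written 5' to 3'.
SEQ_ID_234 = "TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG"
SEQ_ID_235 = "TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def duplex_is_complementary(top: str, bottom: str, overhang: int) -> bool:
    """True if the two strands pair perfectly once each 5' overhang is set aside."""
    return top.upper()[overhang:] == reverse_complement(bottom.upper()[overhang:])


def five_prime_overhangs(top: str, bottom: str, overhang: int) -> tuple:
    """Return the unpaired 5' ends of the top and bottom strands."""
    return top[:overhang].upper(), bottom[:overhang].upper()


if __name__ == "__main__":
    print(duplex_is_complementary(SEQ_ID_234, SEQ_ID_235, overhang=4))
    print(five_prime_overhangs(SEQ_ID_234, SEQ_ID_235, overhang=4))
```

Because both overhangs are identical, such a linker can in principle insert in either
orientation, which is consistent with the orientation being confirmed by sequence analysis.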

The plasmid pCGN8619 was constructed by ligating oligonucleotides
5'-TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:236) and
5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:237) into SalI/XhoI-digested
pCGN7770. A fragment containing the napin promoter, polylinker and napin 3'
region was removed from pCGN8619 by digestion with Asp718I; the fragment was blunt-ended
by filling in the 5' overhangs with Klenow fragment then ligated into pCGN5139 that
had been digested with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs
with Klenow fragment. A plasmid containing the insert oriented so that the napin promoter
was closest to the blunted Asp718I site of pCGN5139 and the napin 3' was closest to the
blunted HindIII site was subjected to sequence analysis to confirm both the insert orientation
and the integrity of cloning junctions. The resulting plasmid was designated pCGN8623.

The plasmid pCGN8620 was constructed by ligating oligonucleotides
5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGGAGCT-3' (SEQ ID NO:238) and
5'-CCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:239) into SalI/SacI-digested
pCGN7787. A fragment containing the d35S promoter, polylinker and tml 3' region
was removed from pCGN8620 by complete digestion with Asp718I and partial digestion with
NotI. The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment
then ligated into pCGN5139 that had been digested with Asp718I and HindIII and blunt-ended
by filling in the 5' overhangs with Klenow fragment. A plasmid containing the insert
oriented so that the d35S promoter was closest to the blunted Asp718I site of pCGN5139 and
the tml 3' was closest to the blunted HindIII site was subjected to sequence analysis to
confirm both the insert orientation and the integrity of cloning junctions. The resulting
plasmid was designated pCGN8624.

The plasmid pCGN8621 was constructed by ligating oligonucleotides
5'-TCGACCTGCAGGAAGCTTGCGGCCGCGGATCCAGCT-3' (SEQ ID NO:240) and
5'-GGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:241) into SalI/SacI-digested
pCGN7787. A fragment containing the d35S promoter, polylinker and tml 3' region
was removed from pCGN8621 by complete digestion with Asp718I and partial digestion with
NotI. The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment
then ligated into pCGN5139 that had been digested with Asp718I and HindIII and blunt-ended
by filling in the 5' overhangs with Klenow fragment. A plasmid containing the insert
oriented so that the d35S promoter was closest to the blunted Asp718I site of pCGN5139 and
the tml 3' was closest to the blunted HindIII site was subjected to sequence analysis to
confirm both the insert orientation and the integrity of cloning junctions. The resulting
plasmid was designated pCGN8625.

The coding regions of the various acyltransferase sequences were cloned as
NotI/Sse8387I fragments into pCGN8622, pCGN8623, pCGN8624, and pCGN8625, for
expression in sense or antisense orientations from a tissue preferential promoter, napin, or the
35S promoter. Fragments which were cloned into the pCGN8622 vector created the
constructs pCGN8901 (ATAT1), pCGN8571 (ATAT2), pCGN8909 (ATAT3), pCGN8596
(ATAT4), pCGN8919 (ATAT6), pCGN8914 (ATAT7), pCGN8905 (ATAT8), pCGN9973
(ATAT9), pCGN9942 (ATAT10), pCGN8575 (ATAT11), and pCGN8633 (ATLPAAT1) for
the sense expression of the respective coding sequences from the napin promoter. Fragments
which were cloned into the pCGN8623 vector created the constructs pCGN8900 (ATAT1),
pCGN8572 (ATAT2), pCGN8910 (ATAT3), pCGN8597 (ATAT4), pCGN8920 (ATAT6),
pCGN8915 (ATAT7), pCGN8906 (ATAT8), pCGN9972 (ATAT9), pCGN9943 (ATAT10),
pCGN8576 (ATAT11), and pCGN8634 (ATLPAAT1) for the antisense expression of the
respective coding sequences from the napin promoter. Fragments which were cloned into the
pCGN8624 vector created the constructs pCGN8903 (ATAT1), pCGN8573 (ATAT2),
pCGN8911 (ATAT3), pCGN8598 (ATAT4), pCGN8921 (ATAT6), pCGN8916 (ATAT7),
pCGN8907 (ATAT8), pCGN9971 (ATAT9), pCGN9944 (ATAT10), pCGN8577 (ATAT11),
and pCGN8635 (ATLPAAT1) for the sense expression of the respective coding sequences
from the 35S promoter. Fragments which were cloned into the pCGN8625 vector created the
constructs pCGN8902 (ATAT1) and pCGN9974 (ATAT9) for the antisense expression of
the respective coding sequences from the 35S promoter.

In addition, the yeast acyltransferase coding sequences were cloned into the vector
pCGN8624 creating the constructs pCGN9926 (YSCAT1), pCGN9927 (YSCAT2),
pCGN9928 (YSCAT3), pCGN9929 (YSCAT4), pCGN9930 (YSCAT5), pCGN9931
(YSCAT6), pCGN9932 (YSCAT7), and pCGN9933 (YSCAT8). These constructs allow for
the sense expression of the respective acyltransferase coding sequences from the 35S
promoter in plant cells.


Example 8: Plant Transformation 

A variety of methods have been developed to insert a DNA sequence of interest into the
genome of a plant host to obtain the transcription or transcription and translation of the sequence
to effect phenotypic changes.

Transgenic Brassica plants are obtained by Agrobacterium-mediated transformation
as described by Radke et al. (Theor. Appl. Genet. (1988) 75:685-694; Plant Cell Reports
(1992) 11:499-505). Transgenic Arabidopsis thaliana plants may be obtained by
Agrobacterium-mediated transformation as described by Valvekens et al. (Proc. Nat. Acad.
Sci. (1988) 85:5536-5540), or as described by Bent et al. ((1994), Science 265:1856-1860), or
Bechtold et al. ((1993), C.R. Acad. Sci., Life Sciences 316:1194-1199) or Clough et al. ((1998)
Plant J. 16:735-43). Other plant species may be similarly transformed using related
techniques.

Alternatively, microprojectile bombardment methods, such as described by Klein et
al. (Bio/Technology 10:286-291), may also be used to obtain nuclear transformed plants.

The above results demonstrate that the nucleic acid sequences identified encode
proteins which are related to acyltransferase proteins. Such
acyltransferase sequences find use in preparing expression constructs for plant
transformations.

All publications and patent applications mentioned in this specification are indicative
of the level of skill of those skilled in the art to which this invention pertains. All
publications and patent applications are herein incorporated by reference to the same extent as
if each individual publication or patent application was specifically and individually indicated
to be incorporated by reference.

Although the foregoing invention has been described in some detail by way of
illustration and example for purposes of clarity of understanding, it will be obvious that
certain changes and modifications may be practiced within the scope of the appended claims.

Claims

What is Claimed is:

1. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO: 127 
(VxNHxS) wherein the H is the conserved Histidine residue in the conserved peptide 
sequence HXXXXD of said acyltransferase-like protein, x representing any amino acid. 

2. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO: 128
(VTYSxS) within about 30 amino acids downstream from the conserved amino acid sequence
HXXXXD of said acyltransferase-like protein, x representing any amino acid.

3. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO: 129
(VxLTRxR) within about 60 amino acids downstream from the conserved amino acid
sequence HXXXXD of said acyltransferase-like protein, x representing any amino acid.

4. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO: 132
(LxxGDLV) within about 20 amino acids upstream of the conserved amino acid sequence
PEG of said acyltransferase-like protein, x representing any amino acid.

5. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO: 130 (CPEGT)
containing the conserved amino acid sequence PEG of said acyltransferase-like protein.

6. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins, 

wherein said enzyme includes the amino acid sequence of SEQ ID NO: 133 
(FxxGAF) within about 20 amino acids downstream from the conserved amino acid sequence 
PEG of said acyltransferase-like protein, x representing any amino acid. 

7. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like 
proteins, 

wherein said enzyme includes the amino acid sequence of SEQ ID NO: 131 (IVPVA) 
within about 40 amino acids downstream from the conserved amino acid sequence PEG of 
said acyltransferase-like protein. 

8. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like 
proteins, 

wherein said enzyme includes the amino acid sequence of SEQ ID NO: 134 
(VANxxQ) within about 110 amino acids downstream from the conserved amino acid
sequence PEG of said acyltransferase-like protein, x representing any amino acid. 

9. A DNA sequence encoding an enzyme of the class of acyltransferase-like proteins, 
said DNA sequence obtainable by the steps comprising: 

(a) using the profile of Figure 1 to search a nucleic acid sequence database; 

(b) obtaining a probability score for nucleic acid sequences in said sequence 
database using the Smith-Waterman algorithm; and 

(c) selecting a nucleic acid sequence having a probability score of less than about 1.

10. The DNA encoding sequence according to Claim 9, wherein said DNA sequence
is an encoding sequence.

11. The DNA encoding sequence according to Claim 9, wherein said DNA sequence
is an EST.

12. The DNA encoding sequence according to any one of Claims 1 to 11, wherein
said acyltransferase-like protein is from a plant.

13. A construct comprising a DNA sequence of any one of Claims 1 to 11 linked to a
heterologous transcriptional and translational initiation region functional in a host cell.

14. The construct according to Claim 13 wherein said host cell is a plant cell.

15. A plant cell comprising a DNA construct according to Claim 13.

16. A plant comprising a cell according to Claim 15.

17. The DNA encoding sequence of any one of Claims 1 to 11 wherein said
acyltransferase-like protein is from Arabidopsis thaliana.

18. The DNA encoding sequence of any one of Claims 1 to 11 wherein said
acyltransferase-like protein is from corn.

19. The DNA encoding sequence of Claim 18 wherein said sequence comprises an
EST selected from the group consisting of SEQ ID NO: 86 through SEQ ID NO: 126.

20. The DNA encoding sequence of any one of Claims 1 to 11 wherein said
acyltransferase-like protein is from soybean.

21. The DNA encoding sequence of Claim 20 wherein said sequence comprises an
EST selected from the group consisting of SEQ ID NO: 24 through SEQ ID NO: 85.

22. The DNA encoding sequence of any one of Claims 2, 3, 4, 5, 7 and 8 wherein
said acyltransferase-like protein is selected from the group consisting of SEQ ID NO: 1, SEQ
ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14 and SEQ ID NO: 16.

23. The DNA encoding sequence of either of Claim 1 and Claim 6 wherein said
acyltransferase-like protein is selected from the group consisting of SEQ ID NO: 3, SEQ ID
NO: 5, SEQ ID NO: 7 and SEQ ID NO: 18.

SEQUENCE LISTING



<110> Lassner, Michael W 
Emig, Robin A 
Ruezinsky, Diane 
Van Eenennaam, Alison 

<120> Novel Plant Acyltransf erases 

<130> 17029/00/WO 

<140> 
<141> 

<150> 60/101,939 

<151> 1998-09-25 

<160> 241 

<170> Patentln Ver . 2.0 

<210> 1 
<211> 869 
<212> DNA 

<213> Arabidopsis sp. 



<400> 1 

atggttcatg 

gtcttccatg 

ctatggcttc 

ctgaaagatt 

atcgtcctcc 

ccgcgcttga 

acagtgtctc 

accgtgccac 

gtccggaagg 

agctaagcga 

ccacagttag 

gctatgaagc 

agactcctat 

aatgcaccga 

tggagtctat 



cgaccaagtc 
atgggcgttt 
cttttggttt 
tgtccgttac 
acctccttcc 
tcccatcatc 
tcgtctctcc 
cgatgctgcc 
cacgacgtgt 
ccggattgtg 
gggtgtgaag 
cactttcttg 
agaggtggct 
acttactcgc 
caacaacacc 



agccacaacg 
agcgcaacgt 
catctctcca 
acttacgaga 
cctggaactc 
gtcgctattg 
cttatgcttt 
aacatgagaa 
agagaagagt 
ccagtagcga 
ttttgggacc 
gatcgtttgc 
aattacgtcc 
aaggataaat 
aagaagtga 



attccaaaag 
ccaactccgt 
tcattcgcgt 
tgctcgggat 
ttggcaacct 
ctcttggacg 
ctcctattcc 
aacttctcga 
atctactgag 
tgaactgtaa 
cttacttctt 
ctgaagaaat 
agaaagttat 
atcttttgct 



aacgcttaaa 
taaacgccat 
ctacttcaac 
ccacttaacc 
ctatgtcctt 
taagatctgt 
tgctgttgcc 
gaaaggcgac 
atttagcgct 
acaaggaatg 
cttcatgaac 
gactgtcaac 
cggcgcggtt 
tggaggtaat 



gaaccgcata 
tatcacatac 
ctccctttac 
attcgtggtc 
aaccaccgta 
tgcgtcactt 
ctcacccgtg 
ttggtgatat 
ctattcgcag 
ttcaacggga 
ccaagaccaa 
ggtggtggca 
ttgggcttcg 
gacggcaagg 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

869 



<210> 2 
<211> 289 
<212> PRT 

<213> Arabidopsis sp. 



<400> 2 
Met Val His 
1 

Lys Asn Arg 



Pro Leu Asn 
35 

Leu Ser lie 
50 

Val Arg Tyr 
65 

His Arg Pro 
Leu Asn His 



Ala Thr 
5 

lie Val 
20 

Ala lie 
lie Arg 
Thr Tyr 



Pro Pro 
85 

Arg Thr 
100 



Lys Ser 

Phe His 

lie Thr 

Val Tyr 
55 

Glu Met 
70 

Pro Ser 
Ala Leu 



Ala Thr Thr 
10 

Asp Gly Arg 
25 

Tyr Leu Trp 
40 

Phe Asn Leu 
Leu Gly He 



He Pro Lys Glu 



Pro Gly Thr 
90 

Asp Pro He 
105 



Leu Ala Gin Arg 
30 

Leu Pro Phe Gly 
45 

Pro Leu Pro Glu 
60 

His Leu Thr He 
75 

Leu Gly Asn Leu 



Arg Leu 
15 

Pro Thr 
Phe He 
Arg Phe 



He Val Ala He 
110 



Arg Gly 
80 

Tyr Val 
95 

Ala Leu 

SEQUENCE LISTING 

<110> Lassner, Michael W 
Emig, Robin A 
Ruezinsky, Diane 
Van Eenennaam, Alison 

<120> Novel Plant Acyltransferases 

<130> 17029/00/WO 

<140> 
<141> 

<150> 60/101,939 
<151> 1998-09-25 

<160> 241 

<170> PatentIn Ver. 2.0 



<210> 1 
<211> 869 
<212> DNA 
<213> Arabidopsis sp. 

<400> 1 
atggttcatg cgaccaagtc agccacaacg attccaaaag aacgcttaaa gaaccgcata 60 
gtcttccatg atgggcgttt agcgcaacgt ccaactccgt taaacgccat tatcacatac 120 
ctatggcttc cttttggttt catctctcca tcattcgcgt ctacttcaac ctccctttac 180 
ctgaaagatt tgtccgttac acttacgaga tgctcgggat ccacttaacc attcgtggtc 240 
atcgtcctcc acctccttcc cctggaactc ttggcaacct ctatgtcctt aaccaccgta 300 
ccgcgcttga tcccatcatc gtcgctattg ctcttggacg taagatctgt tgcgtcactt 360 
acagtgtctc tcgtctctcc cttatgcttt ctcctattcc tgctgttgcc ctcacccgtg 420 
accgtgccac cgatgctgcc aacatgagaa aacttctcga gaaaggcgac ttggtgatat 480 
gtccggaagg cacgacgtgt agagaagagt atctactgag atttagcgct ctattcgcag 540 
agctaagcga ccggattgtg ccagtagcga tgaactgtaa acaaggaatg ttcaacggga 600 
ccacagttag gggtgtgaag ttttgggacc cttacttctt cttcatgaac ccaagaccaa 660 
gctatgaagc cactttcttg gatcgtttgc ctgaagaaat gactgtcaac ggtggtggca 720 
agactcctat agaggtggct aattacgtcc agaaagttat cggcgcggtt ttgggcttcg 780 
aatgcaccga acttactcgc aaggataaat atcttttgct tggaggtaat gacggcaagg 840 
tggagtctat caacaacacc aagaagtga 869 



<210> 2 
<211> 289 
<212> PRT 
<213> Arabidopsis sp. 

<400> 2 

Met Val His Ala Thr Lys Ser Ala Thr Thr Ile Pro Lys Glu Arg Leu 
1 5 10 15 

Lys Asn Arg Ile Val Phe His Asp Gly Arg Leu Ala Gln Arg Pro Thr 
20 25 30 

Pro Leu Asn Ala Ile Ile Thr Tyr Leu Trp Leu Pro Phe Gly Phe Ile 
35 40 45 

Leu Ser Ile Ile Arg Val Tyr Phe Asn Leu Pro Leu Pro Glu Arg Phe 
50 55 60 

Val Arg Tyr Thr Tyr Glu Met Leu Gly Ile His Leu Thr Ile Arg Gly 
65 70 75 80 

His Arg Pro Pro Pro Pro Ser Pro Gly Thr Leu Gly Asn Leu Tyr Val 
85 90 95 

Leu Asn His Arg Thr Ala Leu Asp Pro Ile Ile Val Ala Ile Ala Leu 
100 105 110 

Gly Arg Lys Ile Cys Cys Val Thr Tyr Ser Val Ser Arg Leu Ser Leu 
115 120 125 

Met Leu Ser Pro Ile Pro Ala Val Ala Leu Thr Arg Asp Arg Ala Thr 
130 135 140 

Asp Ala Ala Asn Met Arg Lys Leu Leu Glu Lys Gly Asp Leu Val Ile 
145 150 155 160 

Cys Pro Glu Gly Thr Thr Cys Arg Glu Glu Tyr Leu Leu Arg Phe Ser 
165 170 175 

Ala Leu Phe Ala Glu Leu Ser Asp Arg Ile Val Pro Val Ala Met Asn 
180 185 190 

Cys Lys Gln Gly Met Phe Asn Gly Thr Thr Val Arg Gly Val Lys Phe 
195 200 205 

Trp Asp Pro Tyr Phe Phe Phe Met Asn Pro Arg Pro Ser Tyr Glu Ala 
210 215 220 

Thr Phe Leu Asp Arg Leu Pro Glu Glu Met Thr Val Asn Gly Gly Gly 
225 230 235 240 

Lys Thr Pro Ile Glu Val Ala Asn Tyr Val Gln Lys Val Ile Gly Ala 
245 250 255 

Val Leu Gly Phe Glu Cys Thr Glu Leu Thr Arg Lys Asp Lys Tyr Leu 
260 265 270 

Leu Leu Gly Gly Asn Asp Gly Lys Val Glu Ser Ile Asn Asn Thr Lys 
275 280 285 

Lys 



<210> 3 
<211> 939 
<212> DNA 
<213> Arabidopsis sp. 

<400> 3 
atgacgagct ttactacttc ccttcatgct gtcccgagtg aaaaatttat gggcgaaaca 60 
agacgtactg gcattcaatg gtctaaccgc tctttaagac atgatcctta cagatttctt 120 
gataagaaat cacctagatc aagtcaattg gcaagagata tcactgtgag agcagatctt 180 
tcaggagctg caacccctga ctcttctttt cctgaaccag agattaagtt gagctcaaga 240 
ctcagaggga tattcttttg tgttgttgct ggcatttcgg ctacttttct cattgtcctg 300 
atgattattg ggcatccgtt cgtccttctc ttcgatccct ataggagaaa attccaccac 360 
ttcattgcta aactttgggc ttccataagc atttatccgt tttacaaaat caacatcgag 420 
ggtttggaga atctgccatc atcagacact cctgctgtat atgtttcaaa ccaccaaagt 480 
tttctggata tctacacact tcttagtctt ggaaaaagct ttaagttcat cagcaagaca 540 
gggatattcg taattcccat catcggttgg gccatgtcca tgatgggtgt cgttcccttg 600 
aagcggatgg acccaagaag ccaagtggat tgcttaaaac gctgcatgga acttttaaag 660 
aagggagcat ctgtgttttt cttcccagaa ggaacacgga gtaaggatgg tcggttaggt 720 
tctttcaaga aaggcgcatt tacagtggct gcgaagaccg gagttgcagt agttccaata 780 
acgctaatgg gaacaggcaa aatcatgcca acgggtagtg aaggtatact gaaccatggg 840 
aatgtgagag ttatcatcca taaaccaata catggaagca aagcggatgt tctttgcaac 900 
gaggccagaa gcaagattgc agaatcaatg gatctctaa 939 



<210> 4 
<211> 312 
<212> PRT 
<213> Arabidopsis sp. 

<400> 4 

Met Thr Ser Phe Thr Thr Ser Leu His Ala Val Pro Ser Glu Lys Phe 
1 5 10 15 

Met Gly Glu Thr Arg Arg Thr Gly Ile Gln Trp Ser Asn Arg Ser Leu 
20 25 30 

Arg His Asp Pro Tyr Arg Phe Leu Asp Lys Lys Ser Pro Arg Ser Ser 
35 40 45 

Gln Leu Ala Arg Asp Ile Thr Val Arg Ala Asp Leu Ser Gly Ala Ala 
50 55 60 

Thr Pro Asp Ser Ser Phe Pro Glu Pro Glu Ile Lys Leu Ser Ser Arg 
65 70 75 80 

Leu Arg Gly Ile Phe Phe Cys Val Val Ala Gly Ile Ser Ala Thr Phe 
85 90 95 

Leu Ile Val Leu Met Ile Ile Gly His Pro Phe Val Leu Leu Phe Asp 
100 105 110 

Pro Tyr Arg Arg Lys Phe His His Phe Ile Ala Lys Leu Trp Ala Ser 
115 120 125 

Ile Ser Ile Tyr Pro Phe Tyr Lys Ile Asn Ile Glu Gly Leu Glu Asn 
130 135 140 

Leu Pro Ser Ser Asp Thr Pro Ala Val Tyr Val Ser Asn His Gln Ser 
145 150 155 160 

Phe Leu Asp Ile Tyr Thr Leu Leu Ser Leu Gly Lys Ser Phe Lys Phe 
165 170 175 

Ile Ser Lys Thr Gly Ile Phe Val Ile Pro Ile Ile Gly Trp Ala Met 
180 185 190 

Ser Met Met Gly Val Val Pro Leu Lys Arg Met Asp Pro Arg Ser Gln 
195 200 205 

Val Asp Cys Leu Lys Arg Cys Met Glu Leu Leu Lys Lys Gly Ala Ser 
210 215 220 

Val Phe Phe Phe Pro Glu Gly Thr Arg Ser Lys Asp Gly Arg Leu Gly 
225 230 235 240 

Ser Phe Lys Lys Gly Ala Phe Thr Val Ala Ala Lys Thr Gly Val Ala 
245 250 255 

Val Val Pro Ile Thr Leu Met Gly Thr Gly Lys Ile Met Pro Thr Gly 
260 265 270 

Ser Glu Gly Ile Leu Asn His Gly Asn Val Arg Val Ile Ile His Lys 
275 280 285 

Pro Ile His Gly Ser Lys Ala Asp Val Leu Cys Asn Glu Ala Arg Ser 
290 295 300 

Lys Ile Ala Glu Ser Met Asp Leu 
305 310 

<210> 5 
<211> 1197 
<212> DNA 

<213> Arabidopsis sp. 
<400> 5 

atggaatcag agctcaaaga tttgaattcg aattcgaatc ctccgtcgag caaagaggac 60 

cggccgttac tgaaatcaga atccgatttg gcggctgcca ttgaagagtt agacaaaaag 120 

ttcgcacctt acgcgaggac cgatttgtat gggacgatgg gtttgggtcc tttcccgatg 180 

acggagaata ttaaattggc ggttgcattg gtgactcttg ttccattgcg gtttcttctc 240 

tcgatgagca tcttgcttct ctattacttg atttgtaggg tatttacgct gttttctgct 300 

ccttatcgtg ggccagagga agaggaagat gaaggtggag ttgtttttca ggaagattat 360 

gctcacatgg aaggttggaa acggactgtt atcgtccggt ctgggaggtt tctctctagg 420 

gttttgcttt tcgtttttgg gttttattgg attcacgaga gctgtccaga tcgagattca 480 

gacatggatt ctaatcctaa aactacttct acagagatta accagaaagg ggaagccgcc 540 

acggaggaac ctgaaagacc tggagccatt gtgtccaatc atgtttcgta cttggacatt 600 

ttgtatcata tgtctgcttc ttttccaagt tttgttgcca agagatcagt gggcaaactt 660 

cctcttgttg gcctcattag caaatgcctt ggttgtgtct atgttcaaag agaagcaaaa 720 

tcgcctgatt tcaagggtgt atctggcaca gtaaatgaaa gagttcgaga agctcatagc 780 

aataaatctg ctccaactat tatgcttttt ccagaaggaa caactaccaa tggagactac 840 

ttacttacat tcaagacagg tgcatttttg gctggaactc cagttcttcc ggtaatatta 900 

aaatatccgt atgagcgctt cagtgtggca tgggatacca tatccggggc acgccacatt 960 

ttattccttc tctgtcaagt cgtaaatcac ttggaagtca tacggttacc tgtatactac 1020 

ccatcccaag aagagaaaga cgatcccaaa ctttatgcta gcaatgttcg gaaattaatg 1080 

gccaccgagg gtaacttgat tctatcggag ttgggactta gcgacaaaag gatatatcac 1140 

gcaactctca atggtaatct tagtcaaacc cgtgatttcc atcagaaaga agaatga 1197 

<210> 6 
<211> 398 
<212> PRT 
<213> Arabidopsis sp. 

<400> 6 

Met Glu Ser Glu Leu Lys Asp Leu Asn Ser Asn Ser Asn Pro Pro Ser 
1 5 10 15 

Ser Lys Glu Asp Arg Pro Leu Leu Lys Ser Glu Ser Asp Leu Ala Ala 
20 25 30 

Ala Ile Glu Glu Leu Asp Lys Lys Phe Ala Pro Tyr Ala Arg Thr Asp 
35 40 45 

Leu Tyr Gly Thr Met Gly Leu Gly Pro Phe Pro Met Thr Glu Asn Ile 
50 55 60 

Lys Leu Ala Val Ala Leu Val Thr Leu Val Pro Leu Arg Phe Leu Leu 
65 70 75 80 

Ser Met Ser Ile Leu Leu Leu Tyr Tyr Leu Ile Cys Arg Val Phe Thr 
85 90 95 

Leu Phe Ser Ala Pro Tyr Arg Gly Pro Glu Glu Glu Glu Asp Glu Gly 
100 105 110 

Gly Val Val Phe Gln Glu Asp Tyr Ala His Met Glu Gly Trp Lys Arg 
115 120 125 

Thr Val Ile Val Arg Ser Gly Arg Phe Leu Ser Arg Val Leu Leu Phe 
130 135 140 

Val Phe Gly Phe Tyr Trp Ile His Glu Ser Cys Pro Asp Arg Asp Ser 
145 150 155 160 

Asp Met Asp Ser Asn Pro Lys Thr Thr Ser Thr Glu Ile Asn Gln Lys 
165 170 175 

Gly Glu Ala Ala Thr Glu Glu Pro Glu Arg Pro Gly Ala Ile Val Ser 
180 185 190 

Asn His Val Ser Tyr Leu Asp Ile Leu Tyr His Met Ser Ala Ser Phe 
195 200 205 

Pro Ser Phe Val Ala Lys Arg Ser Val Gly Lys Leu Pro Leu Val Gly 
210 215 220 

Leu Ile Ser Lys Cys Leu Gly Cys Val Tyr Val Gln Arg Glu Ala Lys 
225 230 235 240 

Ser Pro Asp Phe Lys Gly Val Ser Gly Thr Val Asn Glu Arg Val Arg 
245 250 255 

Glu Ala His Ser Asn Lys Ser Ala Pro Thr Ile Met Leu Phe Pro Glu 
260 265 270 

Gly Thr Thr Thr Asn Gly Asp Tyr Leu Leu Thr Phe Lys Thr Gly Ala 
275 280 285 

Phe Leu Ala Gly Thr Pro Val Leu Pro Val Ile Leu Lys Tyr Pro Tyr 
290 295 300 

Glu Arg Phe Ser Val Ala Trp Asp Thr Ile Ser Gly Ala Arg His Ile 
305 310 315 320 

Leu Phe Leu Leu Cys Gln Val Val Asn His Leu Glu Val Ile Arg Leu 
325 330 335 

Pro Val Tyr Tyr Pro Ser Gln Glu Glu Lys Asp Asp Pro Lys Leu Tyr 
340 345 350 

Ala Ser Asn Val Arg Lys Leu Met Ala Thr Glu Gly Asn Leu Ile Leu 
355 360 365 

Ser Glu Leu Gly Leu Ser Asp Lys Arg Ile Tyr His Ala Thr Leu Asn 
370 375 380 

Gly Asn Leu Ser Gln Thr Arg Asp Phe His Gln Lys Glu Glu 
385 390 395 



<210> 7 
<211> 1131 
<212> DNA 
<213> Arabidopsis sp. 

<400> 7 
atgagcagta cggcagggag gctcgtgact tcaaaatccg agcttgacct cgatcaccct 60 
aacatcgaag attaccttcc ttctggttct tccatcaatg aacctcgcgg caagctcagc 120 
ctgcgtgatt tgctagacat ctctccaacg ctcactgaag ctgctggtgc cattgttgat 180 
gactcgttca caagatgttt caaatcaaat cctccagaac cttggaactg gaatatttac 240 
ttattcccac tatactgctt tggggttgtt gttagatact gtatcctctt tcccttgagg 300 
tgcttcactt tagcttttgg gtggattatt ttcctttcat tgtttatccc tgtaaatgcg 360 
ttgctgaaag gtcaagatag gttgaggaaa aagatagaga gggtcttggt ggaaatgatt 420 
tgcagctttt ttgtcgcctc atggaccgga gttgtcaaat atcacgggcc acgtcctagc 480 
atccgtccta agcaggtcta tgttgccaac catacttcaa tgattgattt catcgtattg 540 
gagcagatga ccgcatttgc tgttataatg cagaagcatc ctggttgggt tggtcttctg 600 
caaagcacaa tattagagag tgtgggatgt atctggttca atcgttcaga ggcaaaggat 660 
cgtgaaattg tagcaaaaaa gttaagggac catgtccaag gagctgacag taatcctctt 720 
ctcatatttc ccgaagggac atgtgtaaat aataattaca cagtgatgtt taagaagggt 780 
gcttttgaat tggactgcac tgtttgtcca attgcaatta aatacaacaa gatttttgtt 840 
gacgccttct ggaatagcag aaaacaatca tttactatgc acttgctgca actcatgaca 900 
tcatgggctg ttgtatgtga agtgtggtac ttggaaccac aaaccataag gcccggtgaa 960 
acaggaattg aatttgcaga gagggtcaga gacatgatat ctcttcgggc gggtctcaaa 1020 
aaggtccctt gggatggata cttgaagtat tcgagaccaa gccccaagca tagtgaacgc 1080 
aagcaacaga gtttcgcaga gtcgatcctg gctagattgg aagagaagtg a 1131 



<210> 8 
<211> 376 
<212> PRT 
<213> Arabidopsis sp. 

<400> 8 

Met Ser Ser Thr Ala Gly Arg Leu Val Thr Ser Lys Ser Glu Leu Asp 
1 5 10 15 

Leu Asp His Pro Asn Ile Glu Asp Tyr Leu Pro Ser Gly Ser Ser Ile 
20 25 30 

Asn Glu Pro Arg Gly Lys Leu Ser Leu Arg Asp Leu Leu Asp Ile Ser 
35 40 45 

Pro Thr Leu Thr Glu Ala Ala Gly Ala Ile Val Asp Asp Ser Phe Thr 
50 55 60 

Arg Cys Phe Lys Ser Asn Pro Pro Glu Pro Trp Asn Trp Asn Ile Tyr 
65 70 75 80 

Leu Phe Pro Leu Tyr Cys Phe Gly Val Val Val Arg Tyr Cys Ile Leu 
85 90 95 

Phe Pro Leu Arg Cys Phe Thr Leu Ala Phe Gly Trp Ile Ile Phe Leu 
100 105 110 

Ser Leu Phe Ile Pro Val Asn Ala Leu Leu Lys Gly Gln Asp Arg Leu 
115 120 125 

Arg Lys Lys Ile Glu Arg Val Leu Val Glu Met Ile Cys Ser Phe Phe 
130 135 140 

Val Ala Ser Trp Thr Gly Val Val Lys Tyr His Gly Pro Arg Pro Ser 
145 150 155 160 

Ile Arg Pro Lys Gln Val Tyr Val Ala Asn His Thr Ser Met Ile Asp 
165 170 175 

Phe Ile Val Leu Glu Gln Met Thr Ala Phe Ala Val Ile Met Gln Lys 
180 185 190 

His Pro Gly Trp Val Gly Leu Leu Gln Ser Thr Ile Leu Glu Ser Val 
195 200 205 

Gly Cys Ile Trp Phe Asn Arg Ser Glu Ala Lys Asp Arg Glu Ile Val 
210 215 220 

Ala Lys Lys Leu Arg Asp His Val Gln Gly Ala Asp Ser Asn Pro Leu 
225 230 235 240 

Leu Ile Phe Pro Glu Gly Thr Cys Val Asn Asn Asn Tyr Thr Val Met 
245 250 255 

Phe Lys Lys Gly Ala Phe Glu Leu Asp Cys Thr Val Cys Pro Ile Ala 
260 265 270 

Ile Lys Tyr Asn Lys Ile Phe Val Asp Ala Phe Trp Asn Ser Arg Lys 
275 280 285 

Gln Ser Phe Thr Met His Leu Leu Gln Leu Met Thr Ser Trp Ala Val 
290 295 300 

Val Cys Glu Val Trp Tyr Leu Glu Pro Gln Thr Ile Arg Pro Gly Glu 
305 310 315 320 

Thr Gly Ile Glu Phe Ala Glu Arg Val Arg Asp Met Ile Ser Leu Arg 
325 330 335 

Ala Gly Leu Lys Lys Val Pro Trp Asp Gly Tyr Leu Lys Tyr Ser Arg 
340 345 350 

Pro Ser Pro Lys His Ser Glu Arg Lys Gln Gln Ser Phe Ala Glu Ser 
355 360 365 

Ile Leu Ala Arg Leu Glu Glu Lys 
370 375 



WO 00/18889 PCT/US99/22231 

<210> 9 
<211> 965 
<212> DNA 
<213> Arabidopsis sp. 

<400> 9 
gttgttaagt tacaagtctc ttcaaaaaca cacacacacg tctctcttca cagccaatca 60 
tcgatcacag ctcgattttc ctttattgtt ccgttggttt tcttgagnat ttttctttct 120 
tgggatcatc aaactngtcg gtaaggwaac ttcacggacg gatcttcaat gttgagctgt 180 
tctaatggta ccgtcgtgat cgcaaccgcc atggtttgct caagcaccgc tctgtttctc 240 
gccatggctc gtcaattcca tggaaatcat caaaatccta aggttcttga tcagactcta 300 
cgacccattc tccgttcttg tctatcttca gaggaaacga agaaacaggg gaagaagata 360 
aagaaagtgc ggttcgcgga taatgtgaaa gatacgaaag gtaacgggga agagtaccgg 420 
aggagggaat tgaaccggaa aagcgtaccg aagccagtga ctaaaccggg aaagaccggt 480 
tctatgtgta gaatctctac catgccagcg aaccggatgg ctctgtacaa tgggattctt 540 
agagaccgag atcacagagt tcaatattct tattgacttt ttcttcttga ttagtcaata 600 
gatttaggtt ttgtaaatct ttcttttgtt tttcggtaat attagatttt ttcttggaaa 660 
tttcagatat tgtagacttt gtagttgggt ggtcttcttt ttctcccttt ttgtgtctca 720 
tagtagtagg tggttttctt atgctccact tatctactta cttgttttaa atcaagtgat 780 
gatgtaaata attgacatgt aagtagtcat tagaaatttg aaaaggcaaa tgaaagaata 840 
taaatttgta aaaacatagt gtgcctattg tacatataaa ctctcttttg ttggggatat 900 
ctatggaatt tatattgatt gtgttgaaaa aacaaaaaaa aaaaaaaaaa aaaaaaaaaa 960 
aaaaa 965 



<210> 10 
<211> 1593 
<212> DNA 
<213> Arabidopsis sp. 

<400> 10 
atgtccggta ataagatctc gactcttcaa gctcttgtct tcttcttgta ccggtttttc 60 
attctccgtc gttggtgtca tcgtagccct aaacaaaaat accaaaaatg cccttctcac 120 
ggcctccacc aatatcaaga cctatcgaat cacactttga tattcaacgt cgaaggagct 180 
ctactcaaat caaactcttt attcccttac ttcatggttg tggcattcga agccggaggg 240 
gtgataaggt cacttttcct cttagttctt tatccattta taagcttgat gagctacgaa 300 
atgggcttga agacgatggt gatgctgagc ttctttggag ttaaaaagga aagcttccga 360 
gtggggaaat cagttttgcc taagtatttt ctagaagatg ttgggctcga gatgttccag 420 
gttttgaaaa gaggaggcaa gagagttgct gtgagtgatt taccacaagt tatgattgat 480 
gtattcttgc gagattactt ggagatagaa gttgtggtcg gaagagacat gaaaatggtc 540 
ggtggttact acctaggcat cgtggaggat aagaagaacc ttgaaattgc ttttgataaa 600 
gtggttcaag aagaaagact tggtagtggt cgtcgtctta ttggcatcac ttcctttaac 660 
tcgccaagtc acagatctct cttctctcaa ttttgccagg aaatttactt cgtcagaaat 720 
tcagacaaga aaagttggca aaccctacca caagatcaat accctaaacc attgattttc 780 
cacgatggtc gtttagccgt taagccaaca cctttaaaca cactcgtatt attcatgtgg 840 
gccccattcg ccgccgtctt agccgctgca agactcgtct tcggcctaaa cttaccttac 900 
tccctagcca atcccttcct cgccttttcc ggtatccacc ttactctcac cgtcaacaac 960 
cacaacgacc taatatccgc cgacagaaaa agaggttgtc tctttgtgtg taaccataga 1020 
acgttattgg acccacttta catttcatac gctctaagaa agaaaaacat gaaagccgtg 1080 
acgtatagtc taagcagatt atctgagctt ctggctccga tcaagaccgt tagattgact 1140 
cgtgatcgag tcaaagatgg tcaagccatg gagaaattgc tgagccaggg agatctcgtg 1200 
gtttgtccgg aagggactac gtgtagagag ccttacttgc ttcggtttag tccacttttc 1260 
tctgaggttt gtgacgtcat cgtacctgtt gctattgact cacacgtgac tttcttctat 1320 
ggcacgacgg ctagtggtct taaggcattt gatcccattt tcttcctttt gaatcctttc 1380 
ccttcctaca ccgtcaaatt gcttgaccct gtctctggaa gtagctcgtc cacgtgtcga 1440 
ggagtccctg acaatggaaa agttaacttc gaggtggcta atcacgtgca gcatgagatc 1500 
gggaatgcct tggggtttga gtgcaccaac ctcacgagaa gagataagta cttgatcttg 1560 
gccggtaata acggagttgt caagaaaaaa taa 1593 



<210> 11 
<211> 530 
<212> PRT 
<213> Arabidopsis sp. 

<400> 11 

Met Ser Gly Asn Lys Ile Ser Thr Leu Gln Ala Leu Val Phe Phe Leu 
1 5 10 15 

Tyr Arg Phe Phe Ile Leu Arg Arg Trp Cys His Arg Ser Pro Lys Gln 
20 25 30 

Lys Tyr Gln Lys Cys Pro Ser His Gly Leu His Gln Tyr Gln Asp Leu 
35 40 45 

Ser Asn His Thr Leu Ile Phe Asn Val Glu Gly Ala Leu Leu Lys Ser 
50 55 60 

Asn Ser Leu Phe Pro Tyr Phe Met Val Val Ala Phe Glu Ala Gly Gly 
65 70 75 80 

Val Ile Arg Ser Leu Phe Leu Leu Val Leu Tyr Pro Phe Ile Ser Leu 
85 90 95 

Met Ser Tyr Glu Met Gly Leu Lys Thr Met Val Met Leu Ser Phe Phe 
100 105 110 

Gly Val Lys Lys Glu Ser Phe Arg Val Gly Lys Ser Val Leu Pro Lys 
115 120 125 

Tyr Phe Leu Glu Asp Val Gly Leu Glu Met Phe Gln Val Leu Lys Arg 
130 135 140 

Gly Gly Lys Arg Val Ala Val Ser Asp Leu Pro Gln Val Met Ile Asp 
145 150 155 160 

Val Phe Leu Arg Asp Tyr Leu Glu Ile Glu Val Val Val Gly Arg Asp 
165 170 175 

Met Lys Met Val Gly Gly Tyr Tyr Leu Gly Ile Val Glu Asp Lys Lys 
180 185 190 

Asn Leu Glu Ile Ala Phe Asp Lys Val Val Gln Glu Glu Arg Leu Gly 
195 200 205 

Ser Gly Arg Arg Leu Ile Gly Ile Thr Ser Phe Asn Ser Pro Ser His 
210 215 220 

Arg Ser Leu Phe Ser Gln Phe Cys Gln Glu Ile Tyr Phe Val Arg Asn 
225 230 235 240 

Ser Asp Lys Lys Ser Trp Gln Thr Leu Pro Gln Asp Gln Tyr Pro Lys 
245 250 255 

Pro Leu Ile Phe His Asp Gly Arg Leu Ala Val Lys Pro Thr Pro Leu 
260 265 270 

Asn Thr Leu Val Leu Phe Met Trp Ala Pro Phe Ala Ala Val Leu Ala 
275 280 285 

Ala Ala Arg Leu Val Phe Gly Leu Asn Leu Pro Tyr Ser Leu Ala Asn 
290 295 300 

Pro Phe Leu Ala Phe Ser Gly Ile His Leu Thr Leu Thr Val Asn Asn 
305 310 315 320 

His Asn Asp Leu Ile Ser Ala Asp Arg Lys Arg Gly Cys Leu Phe Val 
325 330 335 

Cys Asn His Arg Thr Leu Leu Asp Pro Leu Tyr Ile Ser Tyr Ala Leu 
340 345 350 

Arg Lys Lys Asn Met Lys Ala Val Thr Tyr Ser Leu Ser Arg Leu Ser 
355 360 365 

Glu Leu Leu Ala Pro Ile Lys Thr Val Arg Leu Thr Arg Asp Arg Val 
370 375 380 

Lys Asp Gly Gln Ala Met Glu Lys Leu Leu Ser Gln Gly Asp Leu Val 
385 390 395 400 

Val Cys Pro Glu Gly Thr Thr Cys Arg Glu Pro Tyr Leu Leu Arg Phe 
405 410 415 

Ser Pro Leu Phe Ser Glu Val Cys Asp Val Ile Val Pro Val Ala Ile 
420 425 430 

Asp Ser His Val Thr Phe Phe Tyr Gly Thr Thr Ala Ser Gly Leu Lys 
435 440 445 

Ala Phe Asp Pro Ile Phe Phe Leu Leu Asn Pro Phe Pro Ser Tyr Thr 
450 455 460 

Val Lys Leu Leu Asp Pro Val Ser Gly Ser Ser Ser Ser Thr Cys Arg 
465 470 475 480 

Gly Val Pro Asp Asn Gly Lys Val Asn Phe Glu Val Ala Asn His Val 
485 490 495 

Gln His Glu Ile Gly Asn Ala Leu Gly Phe Glu Cys Thr Asn Leu Thr 
500 505 510 

Arg Arg Asp Lys Tyr Leu Ile Leu Ala Gly Asn Asn Gly Val Val Lys 
515 520 525 

Lys Lys 
530 



<210> 12 
<211> 1509 
<212> DNA 
<213> Arabidopsis sp. 

<400> 12 
atggttatgg agcaagctgg aacgacatcg tattcggtcg tgtcagagtt tgaaggaaca 60 
atactgaaga acgcagattc attctcttac ttcatgctcg tagccttcga agcagctggt 120 
ctaattcgtt tcgctatctt gttgtttcta tggcccgtaa tcacactcct tgacgttttc 180 
agctacaaaa acgcagctct caagctcaag atttttgtag ccactgttgg tctacgtgaa 240 
ccggagatcg aatcagtggc tagagccgtt ctgccaaaat tctacatgga cgacgtaagc 300 
atggacacgt ggagggtttt cagctcgtgt aagaagaggg tcgtggtcac gagaatgcct 360 
cgagttatgg tggagaggtt tgctaaggag catcttagag cagatgaggt catcggtacg 420 
gaactgattg taaaccggtt cggttttgtc accggtttga ttcgcgaaac ggatgttgat 480 
cagtctgctt tgaaccgtgt cgctaatttg tttgttggtc ggaggcctca actaggtctt 540 
ggaaaaccgg ctttgaccgc ctctacaaat ttcttatcgt tatgtgagga gcatattcat 600 
gcaccaatcc cggagaacta caaccacggt gaccaacaac ttcagctacg tccacttccg 660 
gtgatatttc acgacggaag actagtgaag cggccaacgc cggccaccgc tctcatcatc 720 
ctcctttgga tcccatttgg aatcattctc gccgtgatcc ggatctttct tggagccgtc 780 
ctcccattgt gggccacacc ttacgtctct cagatattcg gtggccatat catcgtcaaa 840 
ggaaagcctc ctcagccacc ggcggctgga aaatccggcg tgctctttgt gtgtactcac 900 
agaaccctaa tggaccctgt ggtattatct tatgtcctcg gacgtagcat cccagccgtt 960 
acttactcaa tctcgcgctt atcagagatc ttatctccca ttccaaccgt ccgattgaca 1020 
agaatccgag atgtggatgc ggctaagatc aaacaacaac tgtcaaaagg agatctagtg 1080 
gtttgtcctg agggaaccac ttgtcgtgaa ccgtttttgt taagattcag cgcgcttttc 1140 
gctgagttaa cggataggat tgttccggtt gcgatgaact acagagtcgg attcttccac 1200 
gcgactacag cgagaggctg gaagggtttg gacccaattt tcttcttcat gaacccaaga 1260 
ccggtttacg agattacgtt cttgaaccag cttcctatgg aggcaacatg ttcgtccggg 1320 
aagagcccgc atgacgtggc gaactatgtt cagagaatct tggcggctac gttagggttt 1380 
gagtgcacca acttcacaag aaaagataag tatagggttc tcgctggaaa cgatggaaca 1440 
gtgtcgtact tgtcgttgct agaccaattg aagaaggtgg ttagcacttt cgagccttgt 1500 
ctccattga 1509 

<210> 13 
<211> 502 
<212> PRT 
<213> Arabidopsis sp. 

<400> 13 

Met Val Met Glu Gln Ala Gly Thr Thr Ser Tyr Ser Val Val Ser Glu 
1 5 10 15 

Phe Glu Gly Thr Ile Leu Lys Asn Ala Asp Ser Phe Ser Tyr Phe Met 
20 25 30 

Leu Val Ala Phe Glu Ala Ala Gly Leu Ile Arg Phe Ala Ile Leu Leu 
35 40 45 

Phe Leu Trp Pro Val Ile Thr Leu Leu Asp Val Phe Ser Tyr Lys Asn 
50 55 60 

Ala Ala Leu Lys Leu Lys Ile Phe Val Ala Thr Val Gly Leu Arg Glu 
65 70 75 80 

Pro Glu Ile Glu Ser Val Ala Arg Ala Val Leu Pro Lys Phe Tyr Met 
85 90 95 

Asp Asp Val Ser Met Asp Thr Trp Arg Val Phe Ser Ser Cys Lys Lys 
100 105 110 

Arg Val Val Val Thr Arg Met Pro Arg Val Met Val Glu Arg Phe Ala 
115 120 125 

Lys Glu His Leu Arg Ala Asp Glu Val Ile Gly Thr Glu Leu Ile Val 
130 135 140 

Asn Arg Phe Gly Phe Val Thr Gly Leu Ile Arg Glu Thr Asp Val Asp 
145 150 155 160 

Gln Ser Ala Leu Asn Arg Val Ala Asn Leu Phe Val Gly Arg Arg Pro 
165 170 175 

Gln Leu Gly Leu Gly Lys Pro Ala Leu Thr Ala Ser Thr Asn Phe Leu 
180 185 190 

Ser Leu Cys Glu Glu His Ile His Ala Pro Ile Pro Glu Asn Tyr Asn 
195 200 205 

His Gly Asp Gln Gln Leu Gln Leu Arg Pro Leu Pro Val Ile Phe His 
210 215 220 

Asp Gly Arg Leu Val Lys Arg Pro Thr Pro Ala Thr Ala Leu Ile Ile 
225 230 235 240 

Leu Leu Trp Ile Pro Phe Gly Ile Ile Leu Ala Val Ile Arg Ile Phe 
245 250 255 

Leu Gly Ala Val Leu Pro Leu Trp Ala Thr Pro Tyr Val Ser Gln Ile 
260 265 270 

Phe Gly Gly His Ile Ile Val Lys Gly Lys Pro Pro Gln Pro Pro Ala 
275 280 285 

Ala Gly Lys Ser Gly Val Leu Phe Val Cys Thr His Arg Thr Leu Met 
290 295 300 

Asp Pro Val Val Leu Ser Tyr Val Leu Gly Arg Ser Ile Pro Ala Val 
305 310 315 320 

Thr Tyr Ser Ile Ser Arg Leu Ser Glu Ile Leu Ser Pro Ile Pro Thr 
325 330 335 

Val Arg Leu Thr Arg Ile Arg Asp Val Asp Ala Ala Lys Ile Lys Gln 
340 345 350 

Gln Leu Ser Lys Gly Asp Leu Val Val Cys Pro Glu Gly Thr Thr Cys 
355 360 365 

Arg Glu Pro Phe Leu Leu Arg Phe Ser Ala Leu Phe Ala Glu Leu Thr 
370 375 380 

Asp Arg Ile Val Pro Val Ala Met Asn Tyr Arg Val Gly Phe Phe His 
385 390 395 400 

Ala Thr Thr Ala Arg Gly Trp Lys Gly Leu Asp Pro Ile Phe Phe Phe 
405 410 415 

Met Asn Pro Arg Pro Val Tyr Glu Ile Thr Phe Leu Asn Gln Leu Pro 
420 425 430 

Met Glu Ala Thr Cys Ser Ser Gly Lys Ser Pro His Asp Val Ala Asn 
435 440 445 

Tyr Val Gln Arg Ile Leu Ala Ala Thr Leu Gly Phe Glu Cys Thr Asn 
450 455 460 

Phe Thr Arg Lys Asp Lys Tyr Arg Val Leu Ala Gly Asn Asp Gly Thr 
465 470 475 480 

Val Ser Tyr Leu Ser Leu Leu Asp Gln Leu Lys Lys Val Val Ser Thr 
485 490 495 

Phe Glu Pro Cys Leu His 
500 



<210> 14 
<211> 1563 
<212> DNA 
<213> Arabidopsis sp. 

<400> 14 
atgtccgcca agatttcaat attccaagct cttgtctttc tattctaccg gtttatcctc 60 
cggcgatatc ggaactctaa accaaaatac caaaatggcc cttcttctct cctccaatcc 120 
gacctatcac gccacacatt gatcttcaac gtagaaggag ctcttctcaa atccgactct 180 
ctcttccctt acttcatgtt agtagcattt gaggcgggag gcgtaataag gtcatttctc 240 
ctcttcattc tctatccatt gataagcttg atgagccatg agatgggtgt caaagtgatg 300 
gtaatggtga gcttcttcgg gatcaaaaaa gaaggttttc gagcggggag agcggttttg 360 
cctaaatact ttctagaaga tgtcggactc gagatcttcg aagtgttgaa gagaggaggg 420 
aagaaaatcg gagtgagtga tgatcttcct caagttatga tcgaagggtt cttgagagat 480 
tacttggaga ttgacgttgt ggtcgggaga gaaatgaaag tcgttggagg ttattatcta 540 
ggtatcatgg aggataaaac caaacatgat cttgtctttg atgagttagt tcgtaaagag 600 
agactaaaca ccggtcgtgt tattggcatc acttccttca atacatctct tcaccgatat 660 
ctattctctc agttttgcca ggaaatttat ttcgtgaaga aatcagacaa gcgaagctgg 720 
caaaccctac cacgaagcca gtaccctaaa ccattgattt tccatgatgg ccgtctcgcg 780 
atcaaaccaa ccctaatgaa cactttggtc ttgttcatgt ggggtccttt cgcagccgca 840 
gccgcagcag ccagactctt cgtctctctt tgcatccctt actctttatc aatcccgatc 900 
ctcgcctttt ccggttgcag actaaccgtc actaacgact acgtttcatc tcaaaaacaa 960 
aaaccaagtc aacgcaaagg ttgtctcttt gtatgtaacc ataggacttt attggaccct 1020 
ctctatgttg cattcgcttt gagaaagaaa aacatcaaaa ctgtaacgta tagtttgagt 1080 
agggtatctg agattttggc tccgatcaag acggtgagac tgacccgtga tcgggtgagc 1140 
gacggtcaag ccatggagaa attgttaacc gaaggagatc tcgttgtttg tcctgaagga 1200 
accacttgta gagaacctta cctgcttagg tttagccctt tgttcaccga ggttagtgat 1260 
gtcatcgttc ccgtggctgt gacggtacac gtgaccttct tctacggtac aacggcgagt 1320 
ggtcttaagg cacttgaccc gcttttcttc ctcttggatc cttatcctac ctacaccatc 1380 
caatttctcg accctgtctc cggtgccacg tgccaagatc ctgatggaaa gttgaagttt 1440 
gaggtggcca acaatgttca gagtgatatt gggaaggcgc tggatttcga gtgcacaagt 1500 
ctcactagaa aagacaagta tttgatcttg gccggtaata atggagtagt taagaaaaat 1560 
taa 1563 

<210> 15 
<211> 520 
<212> PRT 
<213> Arabidopsis sp. 

<400> 15 

Met Ser Ala Lys Ile Ser Ile Phe Gln Ala Leu Val Phe Leu Phe Tyr 
1 5 10 15 

Arg Phe Ile Leu Arg Arg Tyr Arg Asn Ser Lys Pro Lys Tyr Gln Asn 
20 25 30 

Gly Pro Ser Ser Leu Leu Gln Ser Asp Leu Ser Arg His Thr Leu Ile 
35 40 45 

Phe Asn Val Glu Gly Ala Leu Leu Lys Ser Asp Ser Leu Phe Pro Tyr 
50 55 60 

Phe Met Leu Val Ala Phe Glu Ala Gly Gly Val Ile Arg Ser Phe Leu 
65 70 75 80 

Leu Phe Ile Leu Tyr Pro Leu Ile Ser Leu Met Ser His Glu Met Gly 
85 90 95 

Val Lys Val Met Val Met Val Ser Phe Phe Gly Ile Lys Lys Glu Gly 
100 105 110 

Phe Arg Ala Gly Arg Ala Val Leu Pro Lys Tyr Phe Leu Glu Asp Val 
115 120 125 

Gly Leu Glu Ile Phe Glu Val Leu Lys Arg Gly Gly Lys Lys Ile Gly 
130 135 140 

Val Ser Asp Asp Leu Pro Gln Val Met Ile Glu Gly Phe Leu Arg Asp 
145 150 155 160 

Tyr Leu Glu Ile Asp Val Val Val Gly Arg Glu Met Lys Val Val Gly 
165 170 175 

Gly Tyr Tyr Leu Gly Ile Met Glu Asp Lys Thr Lys His Asp Leu Val 
180 185 190 

Phe Asp Glu Leu Val Arg Lys Glu Arg Leu Asn Thr Gly Arg Val Ile 
195 200 205 

Gly Ile Thr Ser Phe Asn Thr Ser Leu His Arg Tyr Leu Phe Ser Gln 
210 215 220 

Phe Cys Gln Glu Ile Tyr Phe Val Lys Lys Ser Asp Lys Arg Ser Trp 
225 230 235 240 

Gln Thr Leu Pro Arg Ser Gln Tyr Pro Lys Pro Leu Ile Phe His Asp 
245 250 255 

Gly Arg Leu Ala Ile Lys Pro Thr Leu Met Asn Thr Leu Val Leu Phe 
260 265 270 

Met Trp Gly Pro Phe Ala Ala Ala Ala Ala Ala Ala Arg Leu Phe Val 
275 280 285 

Ser Leu Cys Ile Pro Tyr Ser Leu Ser Ile Pro Ile Leu Ala Phe Ser 
290 295 300 

Gly Cys Arg Leu Thr Val Thr Asn Asp Tyr Val Ser Ser Gln Lys Gln 
305 310 315 320 

Lys Pro Ser Gln Arg Lys Gly Cys Leu Phe Val Cys Asn His Arg Thr 
325 330 335 

Leu Leu Asp Pro Leu Tyr Val Ala Phe Ala Leu Arg Lys Lys Asn Ile 
340 345 350 

Lys Thr Val Thr Tyr Ser Leu Ser Arg Val Ser Glu Ile Leu Ala Pro 
355 360 365 

Ile Lys Thr Val Arg Leu Thr Arg Asp Arg Val Ser Asp Gly Gln Ala 
370 375 380 

Met Glu Lys Leu Leu Thr Glu Gly Asp Leu Val Val Cys Pro Glu Gly 
385 390 395 400 

Thr Thr Cys Arg Glu Pro Tyr Leu Leu Arg Phe Ser Pro Leu Phe Thr 
405 410 415 

Glu Val Ser Asp Val Ile Val Pro Val Ala Val Thr Val His Val Thr 
420 425 430 

Phe Phe Tyr Gly Thr Thr Ala Ser Gly Leu Lys Ala Leu Asp Pro Leu 
435 440 445 

Phe Phe Leu Leu Asp Pro Tyr Pro Thr Tyr Thr Ile Gln Phe Leu Asp 
450 455 460 

Pro Val Ser Gly Ala Thr Cys Gln Asp Pro Asp Gly Lys Leu Lys Phe 
465 470 475 480 

Glu Val Ala Asn Asn Val Gln Ser Asp Ile Gly Lys Ala Leu Asp Phe 
485 490 495 

Glu Cys Thr Ser Leu Thr Arg Lys Asp Lys Tyr Leu Ile Leu Ala Gly 
500 505 510 

Asn Asn Gly Val Val Lys Lys Asn 
515 520 



<210> 16 
<211> 1506 
<212> DNA 

<213> Arabidopsis sp. 



<400> 16 

atgggagctc aggagaaacg gcgccgtttc gagcagatat caaagtgcga tgttaaggac 60 
cggtccaacc ataccgtggc cgctgatcta gacggaacac tactaatctc tcgtagcgcc 120 
ttcccttact atttcctcgt agccctcgag gcagggagct tgctccgagc gttgatccta 180 
cttgtgtccg taccattcgt ttatcttacg tacttgacca tctccgagac tttagccatc 240 
aacgtatttg tcttcatcac gttcgcgggt ctcaagatcc gagacgttga gctagtggtc 300 
cgttccgtcc tcccgaggtt ctatgcggag gacgtgaggc ccgatacctg gcgtatcttc 360 
aacacgttcg ggaaacggta cataataact gcgagccctc gaattatggt cgagccattc 420 
gtgaaaacat tcctaggagt tgataaagtt cttggaacag agctagaggt ctccaaatcg 480 
ggtcgggcaa ccgggttcac cagaaaacca ggtattctcg tcggtcagta caaacgtgac 540 
gtcgttttga gagagtttgg tggcctagcg tctgatttac ctgatttggg gctcggcgat 600 
agcaagacgg accacgactt catgtccatc tgcaaggaag gttacatggt gccacgtacg 660 
aaatgcgaac cattaccaag aaacaaactc ttaagcccca taatattcca cgagggcaga 720 
ttagtccaac gcccaacgcc gttagttgct ctgttaactt tcctctggct tcccgtcggt 780 
ttcgtcctct ctatcatccg cgtctacacg aatattccgt taccggaacg tatcgcccgt 840 
tacaactaca agcttactgg catcaagcta gtcgtcaacg gccaccctcc tccgccgcca 900 
aaacctggcc agccaggcca tcttttggtc tgcaaccacc gcaccgttct cgatcctgtg 960 
gtcacagctg tcgcactcgg ccggaaaatc agctgcgtca cttacagcat cagcaagttc 1020 
tctgagctaa tctcaccaat caaagccgtt gcgttgactc gtcaacgtga gaaagacgca 1080 
gcgaacatca agcgtctttt ggaggaaggc gatctcgtga tatgtcccga gggaaccacg 1140 
tgccgtgagc ctttccttct ccggtttagt gctcttttcg ctgagctcac ggaccggatc 1200 
gttcccgtgg cgatcaacac aaagcagagc atgttcaatg gtaccaccac acgtggatac 1260 
aagcttcttg atccttactt tgcgttcatg aacccgaggc cgacgtatga gatcacgttc 1320 
ctcaaacaga ttccagctga gctgacgtgt aaaggaggca aatctccgat agaggttgcg 1380 
aattacatac agagggtttt gggaggaacc ttaggttttg agtgcaccaa tttcacaaga 1440 
aaggataagt acgcaatgct tgctggtact gacggtaggg ttccggtgaa gaaggagaag 1500 
acgtga 1506 

<210> 17 
<211> 500 
<212> PRT 

<213> Arabidopsis sp. 
<400> 17 

Met Gly Ala Gln Glu Lys Arg Arg Arg Phe Glu Gln Ile Ser Lys Cys 
1 5 10 15 

Asp Val Lys Asp Arg Ser Asn His Thr Val Ala Ala Asp Leu Asp Gly 
20 25 30 

Thr Leu Leu Ile Ser Arg Ser Ala Phe Pro Tyr Tyr Phe Leu Val Ala 
35 40 45 

Leu Glu Ala Gly Ser Leu Leu Arg Ala Leu Ile Leu Leu Val Ser Val 
50 55 60 

Pro Phe Val Tyr Leu Thr Tyr Leu Thr Ile Ser Glu Thr Leu Ala Ile 
65 70 75 80 

Asn Val Phe Val Phe Ile Thr Phe Ala Gly Leu Lys Ile Arg Asp Val 
85 90 95 

Glu Leu Val Val Arg Ser Val Leu Pro Arg Phe Tyr Ala Glu Asp Val 
100 105 110 

Arg Pro Asp Thr Trp Arg Ile Phe Asn Thr Phe Gly Lys Arg Tyr Ile 
115 120 125 

Ile Thr Ala Ser Pro Arg Ile Met Val Glu Pro Phe Val Lys Thr Phe 
130 135 140 

Leu Gly Val Asp Lys Val Leu Gly Thr Glu Leu Glu Val Ser Lys Ser 
145 150 155 160 

Gly Arg Ala Thr Gly Phe Thr Arg Lys Pro Gly Ile Leu Val Gly Gln 
165 170 175 

Tyr Lys Arg Asp Val Val Leu Arg Glu Phe Gly Gly Leu Ala Ser Asp 
180 185 190 

Leu Pro Asp Leu Gly Leu Gly Asp Ser Lys Thr Asp His Asp Phe Met 
195 200 205 

Ser Ile Cys Lys Glu Gly Tyr Met Val Pro Arg Thr Lys Cys Glu Pro 
210 215 220 

Leu Pro Arg Asn Lys Leu Leu Ser Pro Ile Ile Phe His Glu Gly Arg 
225 230 235 240 

Leu Val Gln Arg Pro Thr Pro Leu Val Ala Leu Leu Thr Phe Leu Trp 
245 250 255 

Leu Pro Val Gly Phe Val Leu Ser Ile Ile Arg Val Tyr Thr Asn Ile 
260 265 270 

Pro Leu Pro Glu Arg Ile Ala Arg Tyr Asn Tyr Lys Leu Thr Gly Ile 
275 280 285 

Lys Leu Val Val Asn Gly His Pro Pro Pro Pro Pro Lys Pro Gly Gln 
290 295 300 

Pro Gly His Leu Leu Val Cys Asn His Arg Thr Val Leu Asp Pro Val 
305 310 315 320 

Val Thr Ala Val Ala Leu Gly Arg Lys Ile Ser Cys Val Thr Tyr Ser 
325 330 335 

Ile Ser Lys Phe Ser Glu Leu Ile Ser Pro Ile Lys Ala Val Ala Leu 
340 345 350 

Thr Arg Gln Arg Glu Lys Asp Ala Ala Asn Ile Lys Arg Leu Leu Glu 
355 360 365 

Glu Gly Asp Leu Val Ile Cys Pro Glu Gly Thr Thr Cys Arg Glu Pro 
370 375 380 

Phe Leu Leu Arg Phe Ser Ala Leu Phe Ala Glu Leu Thr Asp Arg Ile 
385 390 395 400 

Val Pro Val Ala Ile Asn Thr Lys Gln Ser Met Phe Asn Gly Thr Thr 
405 410 415 

Thr Arg Gly Tyr Lys Leu Leu Asp Pro Tyr Phe Ala Phe Met Asn Pro 
420 425 430 

Arg Pro Thr Tyr Glu Ile Thr Phe Leu Lys Gln Ile Pro Ala Glu Leu 
435 440 445 

Thr Cys Lys Gly Gly Lys Ser Pro Ile Glu Val Ala Asn Tyr Ile Gln 
450 455 460 

Arg Val Leu Gly Gly Thr Leu Gly Phe Glu Cys Thr Asn Phe Thr Arg 
465 470 475 480 

Lys Asp Lys Tyr Ala Met Leu Ala Gly Thr Asp Gly Arg Val Pro Val 
485 490 495 

Lys Lys Glu Lys 
500 



<210> 18 
<211> 1620 
<212> DNA 

<213> Arabidopsis sp. 
<400> 18 

atggcggatc ctgatctgtc ttctcctttg atccaccatc aatcctccga tcaacctgaa 60 
gttgttatct ctatcgccga cgacgacgac gacgagtcag gactcaatct tcttccagcc 120 
gttgttgacc ctcgtgtttc acgaggtttt gagtttgacc atcttaatcc ttatggcttt 180 
ctcagcgagt cagagcctcc ggttctcggt ccgacgacgg tggatccatt ccggaacaat 240 
acacctggag ttagcggatt gtacgaagcg attaagctcg tgatttgtct tccgattgct 300 
ctgattagac ttgttctctt tgctgctagc ttagctgttg gttacttggc tacaaaattg 360 
gcacttgctg gctggaaaga taaagagaac cctatgcctc tttggagatg cagaatcatg 420 
tggattactc ggatctgtac cagatgtatc ctcttctctt ttggctatca gtggataaga 480 
aggaaaggga aacctgctcg gagagagatt gctccgattg ttgtatcaaa tcatgtttct 540 
tatattgaac caatcttcta cttctatgaa ttatcaccga ccattgttgc atcggagtca 600 
catgattcac ttccatttgt tggaactatt atcagggcaa tgcaggtgat atatgtgaat 660 
agattctcac agacatcaag gaagaatgct gtgcatgaaa taaagagaaa agcttcctgc 720 
gatagatttc ctcgtctgct gttattcccc gaaggaacca cgactaatgg gaaagttctt 780 
atttccttcc aactcggtgc tttcatccct ggttacccta ttcaacctgt agtagtccgg 840 
tatccccatg tacattttga tcaatcctgg ggaaatatct ctttgttgac gctcatgttt 900 
agaatgttca ctcagtttca caatttcatg gaggttgaat atcttcctgt aatctatccc 960 
agtgaaaagc aaaagcagaa tgctgtgcgt ctctcacaga agactagtca tgcaattgca 1020 
acatctttga atgtcgtcca aacatcccat tcttttgcgg acttgatgct actcaacaaa 1080 
gcaactgagt taaagctgga gaacccctca aattacatgg ttgaaatggc aagagttgag 1140 
tcgctattcc atgtaagcag cttagaggca acgcgatttt tggatacatt tgtttccatg 1200 
attccggact cgagtggacg tgttaggcta catgactttc ttcggggtct taaactgaaa 1260 
ccttgccctc tttctaaaag gatatttgag ttcatcgatg tggagaaggt cggatcaatc 1320 
actttcaaac agttcttgtt tgcctcgggc cacgtgttga cacagccgct ttttaagcaa 1380 
acatgcgagc tagccttttc ccattgcgat gcagatggag atggctatat tacaattcaa 1440 
gaactcggag aagctctcaa aaacacaatc ccaaacttga acaaggacga gattcgagga 1500 
atgtaccatt tgctagacga cgaccaagat caaagaatca gccaaaatga cttgttgtcc 1560 
tgcttaagaa gaaaccctct tctcatagcc atctttgcac ctgacttggc cccaacataa 1620 

<210> 19 
<211> 539 
<212> PRT 

<213> Arabidopsis sp. 
<400> 19 

Met Ala Asp Pro Asp Leu Ser Ser Pro Leu Ile His His Gln Ser Ser 
1 5 10 15 

Asp Gln Pro Glu Val Val Ile Ser Ile Ala Asp Asp Asp Asp Asp Glu 
20 25 30 

Ser Gly Leu Asn Leu Leu Pro Ala Val Val Asp Pro Arg Val Ser Arg 
35 40 45 

Gly Phe Glu Phe Asp His Leu Asn Pro Tyr Gly Phe Leu Ser Glu Ser 
50 55 60 

Glu Pro Pro Val Leu Gly Pro Thr Thr Val Asp Pro Phe Arg Asn Asn 
65 70 75 80 

Thr Pro Gly Val Ser Gly Leu Tyr Glu Ala Ile Lys Leu Val Ile Cys 
85 90 95 

Leu Pro Ile Ala Leu Ile Arg Leu Val Leu Phe Ala Ala Ser Leu Ala 
100 105 110 

Val Gly Tyr Leu Ala Thr Lys Leu Ala Leu Ala Gly Trp Lys Asp Lys 
115 120 125 

Glu Asn Pro Met Pro Leu Trp Arg Cys Arg Ile Met Trp Ile Thr Arg 
130 135 140 

Ile Cys Thr Arg Cys Ile Leu Phe Ser Phe Gly Tyr Gln Trp Ile Arg 
145 150 155 160 

Arg Lys Gly Lys Pro Ala Arg Arg Glu Ile Ala Pro Ile Val Val Ser 
165 170 175 

Asn His Val Ser Tyr Ile Glu Pro Ile Phe Tyr Phe Tyr Glu Leu Ser 
180 185 190 

Pro Thr Ile Val Ala Ser Glu Ser His Asp Ser Leu Pro Phe Val Gly 
195 200 205 

Thr Ile Ile Arg Ala Met Gln Val Ile Tyr Val Asn Arg Phe Ser Gln 
210 215 220 

Thr Ser Arg Lys Asn Ala Val His Glu Ile Lys Arg Lys Ala Ser Cys 
225 230 235 240 

Asp Arg Phe Pro Arg Leu Leu Leu Phe Pro Glu Gly Thr Thr Thr Asn 
245 250 255 

Gly Lys Val Leu Ile Ser Phe Gln Leu Gly Ala Phe Ile Pro Gly Tyr 
260 265 270 

Pro Ile Gln Pro Val Val Val Arg Tyr Pro His Val His Phe Asp Gln 
275 280 285 

Ser Trp Gly Asn Ile Ser Leu Leu Thr Leu Met Phe Arg Met Phe Thr 
290 295 300 

Gln Phe His Asn Phe Met Glu Val Glu Tyr Leu Pro Val Ile Tyr Pro 
305 310 315 320 

Ser Glu Lys Gln Lys Gln Asn Ala Val Arg Leu Ser Gln Lys Thr Ser 
325 330 335 

His Ala Ile Ala Thr Ser Leu Asn Val Val Gln Thr Ser His Ser Phe 
340 345 350 

Ala Asp Leu Met Leu Leu Asn Lys Ala Thr Glu Leu Lys Leu Glu Asn 
355 360 365 

Pro Ser Asn Tyr Met Val Glu Met Ala Arg Val Glu Ser Leu Phe His 
370 375 380 

Val Ser Ser Leu Glu Ala Thr Arg Phe Leu Asp Thr Phe Val Ser Met 
385 390 395 400 

Ile Pro Asp Ser Ser Gly Arg Val Arg Leu His Asp Phe Leu Arg Gly 
405 410 415 

Leu Lys Leu Lys Pro Cys Pro Leu Ser Lys Arg Ile Phe Glu Phe Ile 
420 425 430 

Asp Val Glu Lys Val Gly Ser Ile Thr Phe Lys Gln Phe Leu Phe Ala 
435 440 445 

Ser Gly His Val Leu Thr Gln Pro Leu Phe Lys Gln Thr Cys Glu Leu 
450 455 460 

Ala Phe Ser His Cys Asp Ala Asp Gly Asp Gly Tyr Ile Thr Ile Gln 
465 470 475 480 

Glu Leu Gly Glu Ala Leu Lys Asn Thr Ile Pro Asn Leu Asn Lys Asp 
485 490 495 

Glu Ile Arg Gly Met Tyr His Leu Leu Asp Asp Asp Gln Asp Gln Arg 
500 505 510 

Ile Ser Gln Asn Asp Leu Leu Ser Cys Leu Arg Arg Asn Pro Leu Leu 
515 520 525 

Ile Ala Ile Phe Ala Pro Asp Leu Ala Pro Thr 
530 535 

<210> 20 
<211> 1128 
<212> DNA 

<213> Arabidopsis sp. 

<400> 20 

atggaaaaaa agagtgtacc aaattctgat aagttgtctc tgattagagt gttaagaggt 60 
ataatatgtc tgatggtgtt agtttcaaca gcttttatga tgttgatatt ctgggggttc 120 
ttatcagctg tagtgttgag gcttttcagc attcgctata gccgtaaatg tgtttccttc 180 
ttctttggct cgtggctcgc cttgtggcct ttcctctttg agaagattaa caaaaccaaa 240 
gttatcttct ctggtgataa ggttccttgc gaggatcgag tattgctcat tgcaaaccac 300 
cgaacagaag ttgattggat gtacttctgg gatcttgcac tgcgtaaagg ccagattggg 360 
aatatcaaat atgtgcttaa gagtagtttg atgaaattac ctctctttgg ttgggcgttt 420 
cacctctttg agtttattcc tgttgagagg agatgggaag tcgatgaagc aaacttgaga 480 
cagatagttt cgagttttaa ggatccccga gacgctttat ggcttgctct tttccccgag 540 
ggcacagatt acacagaggc taaatgccaa aggagtaaga aatttgctgc tgaaaatggc 600 
cttccgatac tgaacaacgt gctgcttccc aggacaaaag gtttcgtctc ctgcttgcaa 660 
gaactgagtt gctcacttga cgcagtttat gatgtgacca tcggttataa aacccgctgc 720 
ccatctttct tagacaacgt ttatggaatt gagccatcag aagttcacat ccacatccgt 780 
cgtatcaacc tgacccaaat cccaaatcaa gaaaaggaca tcaatgcttg gttaatgaac 840 
acattccagc tcaaagacca gctgctcaat gacttttact ccaatggtca tttccctaac 900 
gaaggaacag agaaagagtt caacacaaag aagtacctca taaactgttt ggcagtgatt 960 
gccttcacca ccatctgtac acatctcacc ttcttctcat caatgatttg gttcaggatt 1020 
tatgtctctt tggcctgtgt ctacttgacc tctgctacgc atttcaatct tcgttctgtt 1080 
ccacttgttg agactgcaaa aaattccctc aaattagtaa acaaataa 1128 



<210> 21 
<211> 375 
<212> PRT 

<213> Arabidopsis sp. 
<400> 21 

Met Glu Lys Lys Ser Val Pro Asn Ser Asp Lys Leu Ser Leu Ile Arg 
1 5 10 15 

Val Leu Arg Gly Ile Ile Cys Leu Met Val Leu Val Ser Thr Ala Phe 
20 25 30 

Met Met Leu Ile Phe Trp Gly Phe Leu Ser Ala Val Val Leu Arg Leu 
35 40 45 

Phe Ser Ile Arg Tyr Ser Arg Lys Cys Val Ser Phe Phe Phe Gly Ser 
50 55 60 

Trp Leu Ala Leu Trp Pro Phe Leu Phe Glu Lys Ile Asn Lys Thr Lys 
65 70 75 80 

Val Ile Phe Ser Gly Asp Lys Val Pro Cys Glu Asp Arg Val Leu Leu 
85 90 95 

Ile Ala Asn His Arg Thr Glu Val Asp Trp Met Tyr Phe Trp Asp Leu 
100 105 110 

Ala Leu Arg Lys Gly Gln Ile Gly Asn Ile Lys Tyr Val Leu Lys Ser 
115 120 125 

Ser Leu Met Lys Leu Pro Leu Phe Gly Trp Ala Phe His Leu Phe Glu 
130 135 140 

Phe Ile Pro Val Glu Arg Arg Trp Glu Val Asp Glu Ala Asn Leu Arg 
145 150 155 160 

Gln Ile Val Ser Ser Phe Lys Asp Pro Arg Asp Ala Leu Trp Leu Ala 
165 170 175 

Leu Phe Pro Glu Gly Thr Asp Tyr Thr Glu Ala Lys Cys Gln Arg Ser 
180 185 190 

Lys Lys Phe Ala Ala Glu Asn Gly Leu Pro Ile Leu Asn Asn Val Leu 
195 200 205 

Leu Pro Arg Thr Lys Gly Phe Val Ser Cys Leu Gln Glu Leu Ser Cys 
210 215 220 

Ser Leu Asp Ala Val Tyr Asp Val Thr Ile Gly Tyr Lys Thr Arg Cys 
225 230 235 240 

Pro Ser Phe Leu Asp Asn Val Tyr Gly Ile Glu Pro Ser Glu Val His 
245 250 255 

Ile His Ile Arg Arg Ile Asn Leu Thr Gln Ile Pro Asn Gln Glu Lys 
260 265 270 

Asp Ile Asn Ala Trp Leu Met Asn Thr Phe Gln Leu Lys Asp Gln Leu 
275 280 285 

Leu Asn Asp Phe Tyr Ser Asn Gly His Phe Pro Asn Glu Gly Thr Glu 
290 295 300 

Lys Glu Phe Asn Thr Lys Lys Tyr Leu Ile Asn Cys Leu Ala Val Ile 
305 310 315 320 

Ala Phe Thr Thr Ile Cys Thr His Leu Thr Phe Phe Ser Ser Met Ile 
325 330 335 

Trp Phe Arg Ile Tyr Val Ser Leu Ala Cys Val Tyr Leu Thr Ser Ala 
340 345 350 

Thr His Phe Asn Leu Arg Ser Val Pro Leu Val Glu Thr Ala Lys Asn 
355 360 365 

Ser Leu Lys Leu Val Asn Lys 
370 375 



<210> 22 
<211> 1170 
<212> DNA 

<213> Arabidopsis sp. 



<400> 22 

atggtgattg ctgcagctgt catcgtgcct ttgggccttc tcttcttcat atctggtctc 60 
gctgtcaatc tctttcaggc agtttgctat gtactcattc gaccactgtc taagaacaca 120 
tacagaaaaa ttaaccgggt ggttgcagaa accttgtggt tggagcttgt atggatagtt 180 
gactggtggg ctggagttaa gatccaagtg tttgctgata atgagacctt caatcgaatg 240 
ggcaaagaac atgctcttgt cgtttgtaat caccgaagtg atattgattg gcttgtggga 300 
tggattctgg ctcagcggtc aggttgcctg ggaagcgcat tagctgtaat gaagaagtct 360 
tccaaattcc ttccagtcat aggctggtca atgtggttct cggagtatct ctttctggaa 420 
agaaattggg ccaaggatga aagcactcta aagtcaggtc ttcagcgctt gagcgacttc 480 
cctcgacctt tctggttagc cctttttgtg gagggaactc gctttacaga agccaaactt 540 
aaagccgcac aagagtatgc agcctcctct gaattgccta tccctcgaaa tgtgttgatt 600 
cctcgcacca aaggtttcgt gtcagctgtt agtaatatgc gttcatttgt cccagcaatt 660 
tatgatatga cagtgactat tccaaaaacc tctccaccac ccacgatgct aagactattc 720 
aaaggacaac cttcagtggt gcatgttcac atcaagtgtc actcgatgaa agacttacct 780 
gaatcagatg acgcaattgc acagtggtgc agagatcagt ttgtggctaa ggatgctctg 840 
ttagacaaac acatagctgc agacactttc cccggtcaac aagaacagaa cattggccgt 900 
cccataaagt cccttgcggt ggttctatca tgggcatgcg tactaactct tggagcaata 960 
aagttcctac actgggcaca actcttttct tcatggaaag gtatcacgat atcggcgctt 1020 
ggtctaggta tcatcactct ctgtatgcag atcctgatac gctcgtctca gtcagagcgt 1080 
tcgaccccag ccaaagtcgt cccagccaag ccaaaagaca atcaccaccc agaatcatcc 1140 
tcccaaacag aaacggagaa ggagaagtaa 1170 

<210> 23 
<211> 389 
<212> PRT 

<213> Arabidopsis sp. 
<400> 23 

Met Val Ile Ala Ala Ala Val Ile Val Pro Leu Gly Leu Leu Phe Phe 
1 5 10 15 

Ile Ser Gly Leu Ala Val Asn Leu Phe Gln Ala Val Cys Tyr Val Leu 
20 25 30 

Ile Arg Pro Leu Ser Lys Asn Thr Tyr Arg Lys Ile Asn Arg Val Val 
35 40 45 

Ala Glu Thr Leu Trp Leu Glu Leu Val Trp Ile Val Asp Trp Trp Ala 
50 55 60 

Gly Val Lys Ile Gln Val Phe Ala Asp Asn Glu Thr Phe Asn Arg Met 
65 70 75 80 

Gly Lys Glu His Ala Leu Val Val Cys Asn His Arg Ser Asp Ile Asp 
85 90 95 

Trp Leu Val Gly Trp Ile Leu Ala Gln Arg Ser Gly Cys Leu Gly Ser 
100 105 110 

Ala Leu Ala Val Met Lys Lys Ser Ser Lys Phe Leu Pro Val Ile Gly 
115 120 125 

Trp Ser Met Trp Phe Ser Glu Tyr Leu Phe Leu Glu Arg Asn Trp Ala 
130 135 140 

Lys Asp Glu Ser Thr Leu Lys Ser Gly Leu Gln Arg Leu Ser Asp Phe 
145 150 155 160 

Pro Arg Pro Phe Trp Leu Ala Leu Phe Val Glu Gly Thr Arg Phe Thr 
165 170 175 

Glu Ala Lys Leu Lys Ala Ala Gln Glu Tyr Ala Ala Ser Ser Glu Leu 
180 185 190 

Pro Ile Pro Arg Asn Val Leu Ile Pro Arg Thr Lys Gly Phe Val Ser 
195 200 205 

Ala Val Ser Asn Met Arg Ser Phe Val Pro Ala Ile Tyr Asp Met Thr 
210 215 220 

Val Thr Ile Pro Lys Thr Ser Pro Pro Pro Thr Met Leu Arg Leu Phe 
225 230 235 240 

Lys Gly Gln Pro Ser Val Val His Val His Ile Lys Cys His Ser Met 
245 250 255 

Lys Asp Leu Pro Glu Ser Asp Asp Ala Ile Ala Gln Trp Cys Arg Asp 
260 265 270 

Gln Phe Val Ala Lys Asp Ala Leu Leu Asp Lys His Ile Ala Ala Asp 
275 280 285 

Thr Phe Pro Gly Gln Gln Glu Gln Asn Ile Gly Arg Pro Ile Lys Ser 
290 295 300 

Leu Ala Val Val Leu Ser Trp Ala Cys Val Leu Thr Leu Gly Ala Ile 
305 310 315 320 

Lys Phe Leu His Trp Ala Gln Leu Phe Ser Ser Trp Lys Gly Ile Thr 
325 330 335 

Ile Ser Ala Leu Gly Leu Gly Ile Ile Thr Leu Cys Met Gln Ile Leu 
340 345 350 

Ile Arg Ser Ser Gln Ser Glu Arg Ser Thr Pro Ala Lys Val Val Pro 
355 360 365 

Ala Lys Pro Lys Asp Asn His His Pro Glu Ser Ser Ser Gln Thr Glu 
370 375 380 

Thr Glu Lys Glu Lys 
385 



<210> 24 

<211> 269 

<212> DNA 

<213> Glycine max 

<400> 24 

gacccactga acgctctcat caccttcacg tggctcccct tcggcttcat cctctccatc 60 

ataagggtct acttcaacct ccctctccca gaacncattg tccgctacac ctacgagatg 120 

ctcggcatca acctcgtcat ccgcggccac cgccctcctc cgccttcccc cggcaccccc 180 

ggcaacctct acgtctgcaa ccaccgcacc gctctcgacc ccatcgtcat cgccattgcc 240 
ctcggccgca aggtctcctg cgtcaccta 269 

<210> 25 

<211> 242 

<212> DNA 

<213> Glycine max 

<400> 25 

tgatcttcca cgacggccgt ttcgtgcaga ggccagaccc actgaacgct ctcatcacct 60 
tcacgtggct ccccttcggc ttcatcctct ccatcataag ggtctacttc aaccttcctc 120 
tcccagaacg cattgtccgc tacacctacg agatgctcgg catcaacctc gtcatccgcg 180 
gccaccgccc tcctccgcct tcccccggca cccccggcaa cctctacgtc tgcaaccacc 240 
gc 242 

<210> 26 

<211> 272 

<212> DNA 

<213> Glycine max 

<400> 26 

gtttgttcaa aggccaactc ctctagcagc cctcttgacc ttcctatggt tgccaattgg 60 
catcatactc tccatnctta agggtctacc ttaacatccc tttgcctgaa agaattgctt 120 
ggtataacta taagctatta ggaatcagag ttattgtgaa gggtacccct ccaccacccc 180 
caaagaaggg tcaaagtggt gtcctatttg tttgtaacca ccgcacagtt ttagaccctg 240 
tggttactgc agttgcactt ggaagaaaaa tt 272 

<210> 27 

<211> 218 

<212> DNA 

<213> Glycine max 

<400> 27 

atagcacagg agggttacat ggtgcctccg agcaaatcag caaaggcagt cccacaggag 60 
cgtctgaaga gcagaatgat cttccacgac gggcgtttcg tgcagaggcc agacccaatg 120 
aatgccctca tcaccttcac atggctccct ttgggtttcg tcctctccat cataagggtc 180 
tacttcaacc tccctctccc agaacgcatc gtccgcta 218 

<210> 28 

<211> 270 

<212> DNA 

<213> Glycine max 



<400> 28 

gtgcctgttg ctgtgaactg caagcagaac atgttctttg gaaccaccgt tcgtggcgtc 60 
aagttctggg acccttaact tacttcttac atgaacccta ggcctgtgta cgaggttacc 120 
ttaccttgat acctttgccg aggagatgtc ggttaaggct ggggggaagt cgtccattga 180 
ggtggccaac cacgtggcag aaggtgctgg gggatgtgtt agggtttgag tgcaccgggt 240 
tgactaggaa ggataagtat atgttgttgg 270 

<210> 29 

<211> 252 

<212> DNA 

<213> Glycine max 

<400> 29 

catgagggta ggtttgctca aaggccaact cctctagctg ccctcttgac cttcctatgg 60 
ctgccaattg gcatcatact ctccatctta agggtctacc ttaacatccc tttgcctgaa 120 
agaattgttg gtacaactac aagctcttag gaatcagagt tattgtgaag ggtacccctc 180 
caccgccccc aaagaagggt caaagtggtg tctatttgtt tgtaaccacc gcacagtatt 240 
agaccctgtt gt 252 

<210> 30 

<211> 272 

<212> DNA 

<213> Glycine max 

<400> 30 

ctgggactgc cttaaacgat gcatggatct tatcaagaaa ggagcctctg tttttttctt 60 
tccagaggga acacgcagta aagatggaag actaggcaca ttcaagaagg gtgctttcag 120 
tgttgctgca aagacaaatg caccagtagt accaattacc cttattggaa ctggtcaaat 180 
catgcctgca ggaaaggagg gaatagtgaa cataggttct gtgaaagtgg ttatacataa 240 
acctattgtt ggaaaggatc ctgacatgtt at 272 

<210> 31 

<211> 239 

<212> DNA 

<213> Glycine max 

<400> 31 

cgggaatcaa ggtcatcaga cttcaagggt gtttcagctg ttgtcactga cagaattcga 60 
gaagctcatc agaatgagtc tgctccatta atgatgttat ttccagaagg tacaaccaca 120 
aatggagagt tcctccttcc attcaagact ggtggttttt tggcaaaggc accggtactt 180 
cctgtgatat tacgatatca ttaccagaga tttagccctg cctgggattc catatctgg 239 

<210> 32 

<211> 242 

<212> DNA 

<213> Glycine max 

<400> 32 

gaacggcaac ggcaacagcg ttcgcgatga ccgtcctctg ctgaagccgg agcctccggt 60 
cttccgccga cagcatcgcc gatatggaga agaagttcgc cgcttacgtc cgccgctacg 120 
tgtacggcac catgggacgc ggcgagttgc ctcccaagga gaagctcttg ctcggtttcg 180 
cgttggtcac tcttctcccc attcgagtcg ttctcgccgt caccatattg ctcttttatt 240 
ac 242 

<210> 33 

<211> 248 

<212> DNA 

<213> Glycine max 

<400> 33 

ttcttcttct ctcactctct aaaaccctaa ctctatacat ggaagggaaa nctcaaatct 60 
natgactaat taattaatcc atcgatcaag catggagtcc gaactcaaag acctcaattc 120 
gaagccgccg aacggcaacg gcaacagcgt tcgcgatgac cgtcctctgc tgaagccgga 180 
gcctccggtc tccgccgaca gcatcgccga tatggagaag aagttcgccg cttacgtccg 240 
ccgcgacg 248 



<210> 34 

<211> 217 

<212> DNA 

<213> Glycine max 



<400> 34 

aaaaccctaa ttctatacat ggaagggaaa tctcaaatct aatgactaat taattaatcc 60 
atcgatcaag catggagtcc gaactcaaag acctcaattc gaagccgccg aacggcaacg 120 
gcaacagcgt tcgcgatgac cgtcctctgc tgaagccgga gcctccggtc tccgccgaca 180 
gcatcgccga tatggagaag aagttcgccg cttacgt 217 

<210> 35 

<211> 257 

<212> DNA 

<213> Glycine max 

<400> 35 

atctctgtct ctgcatttcc ctccctaaaa ccctaattct acatttggaa aggaaatctc 60 
aaatctaatg actaattaat caatcaatcg tattaataat ccatcgatca agtatggagt 120 
ccgaactcaa agacctcaat tcgaagccac ccaactgcaa cggcaacgcc aacagcgttt 180 
gcgacgaccg tcctctgctg aagccggagc ctccggcctc ctccgacagc atcgccgaga 240 
tggagaagaa gttcgcc 257 

<210> 36 

<211> 284 

<212> DNA 

<213> Glycine max 

<400> 36 

cccgaccaaa acaggttttt gtggccaatc atacttccat gattgatttc attatcttag 60 
aacagatgac tgcatttgct gttattatgc agaagcatcc tggatgggtt ggattattgc 120 
agagcaccat tntggagagt gtagggtgta tctggttcaa ccgtacagag gcaaaggatc 180 
gagaagttgt ggcaaggaaa ttgagggatc atgtcctggg agctaacaac aaccctcttc 240 
ttatatttcc tgaaggaact tgtgtaaata atcactactc gtca 284 

<210> 37 

<211> 246 

<212> DNA 

<213> Glycine max 

<400> 37 

ggagatccgc ataagcaaat caatcatcct gttccttcct tatctctgtc tctgcatttc 60 
cctccctaaa accctaattc tacatttgga aaggaantct caaatctaat gataattaat 120 
caatcaatcg tattaataat ccatcgatca agtatggagt ccgaactcaa agacctcaat 180 
tcgaagccac ccaactgcaa cggcaacgcc aacagcgttt gcgacgaccg tcctctgctg 240 
aagccg 246 

<210> 38 

<211> 278 

<212> DNA 

<213> Glycine max 

<400> 38 

gttttctatt gccacgttgt ggaagcgtaa cgaagatgaa tggcattggg aaactcaaat 60 
cgtcgagttc tgaattggac cttcacattg aagattacct accttctgga tccagtgttc 120 
aacaagaacg gcatggcaag ctccgactgt gtgatttgct agacatttct cctagtctat 180 
ctgaggcagc acgtgccatt gtagatgata cattcacaag gtgcttcaag caaatcctcc 240 
agaaccttgg aactggaatg tttatttgtt tcctttgt 278 

<210> 39 

<211> 312 

<212> DNA 

<213> Glycine max 



<400> 39 

ttaactttgg cacattctcc ttttgttcat caatgtgtgt tgtaaattgt ncatttcctt 60 
cagaggtctt tggtaganat gatgtgcagt ttctgtggtg catcttggac tgnggntgtt 120 
aagnatcatg gacccaggcc tagcaggaga ccaaagcagg tttttgtagc caaccatact 180 
tcatgattga tntcattatn tnagaacaga tgactgcttt tgcngttatn atgcagaagc 240 
atcctggatg ggttggtaag cntacagnat gtcaacngtg tatnaaatat gntacacnnn 300 
acttgcgtct tc 312 



<210> 40 

<211> 255 

<212> DNA 

<213> Glycine max 
<400> 40 

ggattattgn ngcanatgca gtcatctgtt ctaagataat ganatcnatc atggaagtat 60 

gattggncac anaaacctgt yttttggttg gatactaggt cttggcccat ggtacttgac 120 

naccccagtc catgatgcaa canaganact gnacatcatc tccaccaaac ccctctgana 180 

ganacgagaa ttgagcaatt tagagtacct tggtttgatg caagtcagta tattcaagtt 240 
tctattcatc aaagg 255 



<210> 41 

<211> 291 

<212> DNA 

<213> Glycine max 

<400> 41 

caacctccca tgcaatcgct caccctctcc gtcacctgaa tctgttttct attccctccg 60 
tcgcgtaaca aggatgaatg gcattgggaa actcaaatcg tcgagttctg aattggacct 120 
tcacattgaa gattacctgc cttctggatc cagtgttcaa caagaacggc atggcaagct 180 
ccgcctgtgt gatttgctag acatttctcc tagtctatct gaggcagcac gtgccattgt 240 
agatgataca ttcacaaggt gcttcaagtc aaatcctcca gaaccttgga a 291 

<210> 42 

<211> 284 

<212> DNA 

<213> Glycine max 

<400> 42 

ctgcaaccta ccatgcaatt cctcacctga atccgttttc tattgccacg ttgtggaagc 60 
gtaacgaaga tgaatggcat tgggaaactc aaatcgtcga gttctgaatt ggaccttcac 120 
attgaagatt acctaccttc tggatccagt gttcaacaag aacggcatgg caagctccga 180 
ctgtgtgatt tgctagacat ttctcctagt ctatctgagg cagcacgtgc catgtagatg 240 
atacatcaca aggtgctcaa gtcaaatctc cagaaccttg gaat 284 



<210> 43 

<211> 268 

<212> DNA 

<213> Glycine max 



<400> 43 

ctgaagtatt ctcgtcctag cccaaagcat agagaaaggn agcaacagaa ctttgctgag 60 
tcagtgctgc ggcgatggga ggaaaagtga tgtgtacctt tatgtggtgt tgttcttaat 120 
tattcttagt aatgccattg cttcgacccc tttttttgct tttgttttgt cattgctaac 180 
tatttatttt taacactttt attaaagata tggcatatat ncacttcagt anacaaagtt 240 
gtnccagtaa tttnttttcc aaaaaaaa 268 



<210> 44 

<211> 241 

<212> DNA 

<213> Glycine max 



<400> 44 

gancaaaatt gccctccatc actttccttg ttagagttgg tttctgcnac ctaccatgca 60 
attccctcac ctgaatccgt tttctattgc cacgttgtgg aagcgtaacg aagatgaatg 120 
gcattgggaa actcaaatcg tcgagttctg aattggacct tcacattgaa gattacctac 180 
cttctggatc cagtgttcaa caagaacggc atggcaagct ccgactgtgt gatttgctag 240 
a 241 



<210> 45 

<211> 247 

<212> DNA 

<213> Glycine max 



<400> 45 

gtaggatgtc tgagatcctt gccccaatca aaacggtgcg gttaactaga aaccgcgacg 60 
aggatgcgaa aatgatgaaa aatttgctgg ggcaagggga cctggtggtt tgtcctgaag 120 
ggaccacatg tagagaacct tatttattga ggttcagccc tctgttctca gagatgtgcg 180 
atgagattgt ccccgttggc agttgattcc cagttatatg ttccacggaa ccactgctgg 240 
tgganta 247 



<210> 46 
<211> 271 
<212> DNA 
<213> Glycine max 



<400> 46 

tgcagggggg cttgttagag ccatagtttt ggttcttcta tacccttttg tttgtgtcgt 60 
aggaaaagag atggggttga agataatggt catggcatgc ttcttcggga tcaaagcatc 120 
gagcttcaga gttggaaggt ccgttttgcc cnaattcttc tnggaggacg ttngtgcaga 180 
aatgtttgag gcactcaaaa aaggagggaa gacagtggga gttaccaatt taccccacgt 240 
gatggtggaa agcttcttga gagagtattt a 271 

<210> 47 

<211> 242 

<212> DNA 

<213> Glycine max 

<400> 47 

ttcacagctg tcacgccgtn aacggaaaat ggcaacggcg agacgcagtt tcccgcctat 60 

caccgaatgc aacggaacga cnccgtgcga ntctgtngnc gccgacctcg agggtacgct 120 

cctcatctcc cgtngctcgt tcccgtactt catgctcgtc gccgtcgaag ccggcagcnt 180 

cctccgcggc ctcatgctnc tcctctccct tccgttcgtc atnatcgcct acctcttcat 240 

ct 242 

<210> 48 

<211> 244 

<212> DNA 

<213> Glycine max 

<400> 48 

acatattctt cagttagctc ccccaaccta tacacttcac caccacacca caaccctacc 60 
ctctctctct gtcatggtca ttggaggagc cttccctcgt ttcgacccaa tcaccaaatg 120 
tagacccaag accgctccaa ccagaccatc gcctcggacc tcgatggcac cctccttgtc 180 
tcccggagtg ccttccccta ctacttcctc gtcgccctcg aagccggcag cgtcttccga 240 
gcct 244 

<210> 49 

<211> 230 

<212> DNA 

<213> Glycine max 



<400> 49 

caacattcca cctagctccc caatcacatc ttcaccacac cataaacctt cttaatttct 60 
ctcttcattt tctcctctat tgtcataatc atggggacct tccctcgctt cgacccaatc 120 
accacccaag accggtccaa ccagaccgtg gcctccgacc ttgacggcac cctcctcgtc 180 
tcccggagcg ccttccccta ctacctcctc gttgccctcg aagccggcag 230 

<210> 50 

<211> 265 

<212> DNA 

<213> Glycine max 

<400> 50 

ctggtgaata atcctaagtt atggagtctg tggtgtgtga gctagaaggc acgcttgtga 60 
aggacaagga tgcgttctca tacttcatgt tggttgcgtt tgaagcttca ggtttggttc 120 
gtttcgcctt gttgctaaca ctattgcccg tgattcggtt ccttgacatg gttggcatga 180 
acgatgcatc tctcaagcta ntnatcttcg tggctgtggc tggtgttcca aagtccgaga 240 
ttgaatcagt ggctagggca gtttt 265 



<210> 51 

<211> 252 

<212> DNA 

<213> Glycine max 

<400> 51 

ctggtgaata atcctaagtt atggagtctg tggtgtgtga gctagaaggc acgcttgtga 60 
aggacaagga tgcgttctca tacttcatgt tggttgcgtt tgaagcttca ggtttggttc 120 
gtttcgcctt gttgctaaca ctattgcccg tgattcggtt ccttgacatg gttggcatga 180 
acgatgcatc tctcaagcta atgatcttcg tggctgtggc tgggttccaa agtccgagat 240 
tgaatcagtg gc 252 



<210> 52 

<211> 218 

<212> DNA 

<213> Glycine max 

<400> 52 

aactgcaact acaacaacat tcattcattc acagctgtca cgccgtgaac ggaaaatggc 60 
aacggcgaga cgcagtttac ccgcctatac accgaatgca acggaacgac accgtgcgag 120 
tctgtggccg ccgacctcga cggtacgctc ctcatntccc gtagctcgtt cccgtacttc 180 
atgctcgtcg ccgtcgaagc cggcagcctc ctccgcgg 218 



<210> 53 

<211> 262 

<212> DNA 

<213> Glycine max 



<400> 53 

ggttaaggac attgagatgg tcgnntcctc ggtgctgccc aagttctaca cc^aggacgt 60 
gcnccccgag agctggagag tcttcaatcc ttcgggaagc gttacattgt cactgctagt 120 
ctagggtgat ggtggagcan tttgttaaga cgtttcttgg ggctgataag gtgcttggga 180 
ctgagcttga ggccacgaaa tcggggaggt tcatgggttt gttaaggagc ctggtgtgct 240 
tgttggggag cacaagaaag tg 262 

<210> 54 

<211> 212 

<212> DNA 

<213> Glycine max 

<400> 54 

gcaactacaa caacattcat tcattcacag ctgtcacgcc gtgaacggaa aatggcaacg 60 
gcgagacgca gtttcccgcc tatcaccgaa tgcaacggaa cgacgccgtg cgagtctgtg 120 
gccgccgacc tcgacggtac gctcctcatc tcccgtagnc cgttcccgta cttcatgctc 180 
gtngccgtcg aagccggcag cctcctccgc gg 212 



<210> 55 

<211> 273 

<212> DNA 

<213> Glycine max 



<400> 55 

catggttttc ttgagcttct ttggcctcag aaaggacaca ttcagaacag gatcagctgt 60 
tctggcaaag ttcttcttag aagatgttgg attggaaggc tttgaggccg taatatgttg 120 
tgagagaaaa gtggcatcta gtaagttgcc aagggtcatg gttgaaaatt tcctcaagga 180 
ctatttaggg gttgatgctg ttatagcaag agaattgaag tcctttagtg gcttcttttt 240 
gggagttttt gagagtaaga agccaattaa aat 273 



<210> 56 

<211> 257 

<212> DNA 

<213> Glycine max 

<400> 56 

ctctcaaaaa aggagggaag acagtgggag tcaccaatct accccatgtg atggtggaaa 60 
gcttcttgag agagtatttg gacattgatt tcgttgtggg cagggagctg aaagttttct 120 
gtggatacta cgtaggattg atggatgaca caaaaactat gcatgccttg gagctggtta 180 
aagaaggaaa aggatgctcc gacatgatcg gaatcacaag gtttcgcaac atacgcgacc 240 
atgatgattt tttctcc 257 

<210> 57 

<211> 240 

<212> DNA 

<213> Glycine max 



<400> 57 

gaactaagtg tgaaccacta ccaagaaaca agcttttaag tccaattatt tttcatgagg 60 
gtaggtttgc tcaaaggcca actcctctag ctgnnctctt gaccttccta tggctgccaa 120 
ttggcatcat actctccatc ttaagggtct accttaacat ccctttgcct gaaagaattg 180 
cttggtacaa ctacaagctc ttaggaatca gagttattgt gaagggtacc cctccaccgc 240 



<210> 58 
<211> 254 
<212> DNA 
<213> Glycine max 
<400> 58 

cttggaataa gggtcattag gaagggtatc cctccacccc cagcnaagaa gggccaaagt 60 
ggagtcctat ttgtatgcaa ccacaggaca gttttagacc ctgtggttac agctgttgca 120 
ttaggaagga aaattagctg tgtcacatat agcataagca aattcactga aataatttca 180 
ccaatcaaag ctgtggcact ctctagggag agggacaaag atgctgccaa catcaagang 240 
ttgcttgagg aagg 254 

<210> 59 

<211> 267 

<212> DNA 

<213> Glycine max 

<400> 59 

gccaganaga cttgcttggt acaactacaa gcttcttgga ataagggtca ttaggaaggg 60 
tatccctcca cccccagcaa agaagggcca aagtggagtc ctatttgtat gcaaccacag 120 
gacagtttta gaccctgtgg ttacagctgt tgcattagga aggaaaatta gctgtgtcac 180 
atatagcata agcaaattca ctgaaataat tcaccaatca aagctgtggc actctctagg 240 
gagagggacc nagatgctgc cnacatc 267 



<210> 60 

<211> 261 

<212> DNA 

<213> Glycine max 



<400> 60 

gtaaccacag ggtctaaaac tgtgcggtgg ttactgcagt tgcacttgnc nagaaaaatt 60 
tgcttatgct atatgtgaca cagctaattc actgnaataa tttcaccaat taaagctgtg 120 
gcactctcaa ggganngaga gaaagatgct gccaatatcc ngagactact tgaggaaggg 180 
gacttggtga tttgccctga aggcacaact tgtagagagc cttcctcttg aggttcagtg 240 
cactatttgc tgaactcact g 261 

<210> 61 

<211> 258 

<212> DNA 

<213> Glycine max 



<400> 61 

caaggagctc acatgcagtg gagggaaatc agctattgaa gttgcaaact acattcaaag 60 

ggttcttgca gggactttgg gatttgagtg cacaaatttg actaggaaga gcaaatatgc 120 

catgcttgca ggcacagatg ggacagttcc atctaaggag aaggcttgan aagggagaga 180 

aattaagttc tcccttttga ttattctgta ttggtgccca atgtgtttcc aaaacactta 240 
gaattatgat agaaataa 258 



<210> 62 

<211> 258 

<212> DNA 

<213> Glycine max 



<400> 62 

attggcataa tcctctccat cctaagggtc tatctcaaca tccctctgcc agaaagactt 60 

gcttgntaca actacaagct tcttggaata agggtcatta ggaagggtat ccctccaccc 120 

ccagcaaaga agggccaaag tggagcctat ttgtatgcaa ccacaggaca gttttagacc 180 

ctgtggttac agctgttgca ttaggaagga aaattagctg tgtcacatat agcataagca 240 

aattcactga aataattt 258 



<210> 63 

<211> 239 

<212> DNA 

<213> Glycine max 



<400> 63 

cacttcacca ccacaccaca accctaccct ctctctctgt catggtcatt ggaggagcct 60 

tccctcgttt cgacccaatc accaaatgta gcacccaaga ccgctccaac cagaccatcg 120 

cctcggacct cgatggcacc ctccttgtct cccggagtgc cttcccctac tacttcctcg 180 

tcgccctcga agccggcagc gtcttccgag ccctccttct cttaaccttc gtccccttc 239 



<210> 64 
<211> 531 
<212> DNA

<213> Glycine max 



<400> 64 

ccgagaaccg gtctaaccaa accgtggcct cggacttgga cggcaccctc ctggtgtccc 60
ccagcgcatt tccttactac atgctggtcg ccatcgaagc cggcagcttc ctccgtggcc 120
ttgtcctcct tgcctccgtc cctttcgtgt attcacgtac atattcctct ccgagaccgc 180
ggccatcaag tccctgatct tcatcgcctt cgcgggcctg aaggtcaggg acgttgagat 240
ggtcgcgtgc tcggtgctgc ccaagttcta cgccgacata ttcttcagtt agctccccca 300
acctatacac ttcaccacca caccacaacc ctaccctctc tctctgtcat ggtcattgga 360
ggagccttcc ctcgtttcga cccaatcacc aaatgtagca cccaagaccg ctccaaccag 420
accatcgcct cggacctcga tggcaccctc cttgtctccc ggagtgcctt cccctactac 480
ttcctcgtcg ccctcgaagc cggcagcgtc ttccgagccc tccttctctt a 531



<210> 65 

<211> 256 

<212> DNA 

<213> Glycine max 



<400> 65 

acatattctt cagttagctc ccccaaccta tacacttcac caccacacca caaccctacc 60
ctctctctct gtcatggtca ttggaggagc cttccctcgt ttcgacccaa tcaccaaatg 120
tagcacccaa gaccgctcca accagaccat cgcctcggac ctcgatggca ccctccttgt 180
ctcccggagt gccttcccct actacttcct cgtcgccctc gaagccggca gcgtcttccg 240
agccctcctt ctctta 256



<210> 66 

<211> 260 

<212> DNA 

<213> Glycine max 



<400> 66 

ccatccaaca tattcttcag ttagctcccc caacctatac acttcaccac cacaccacaa 60
ccctaccctc tctctctgtc atggtcattg gaggagcctt ccctcgtttc gacccaatca 120
ccaaatgtag cacccaagac cgctccaacc agactatcgc ctcggacctc gatggcaccc 180
tccttgtctc ccggagtgcc ttcccctact acttcctcgt cgccctcgaa gccggcagcg 240
tcttccgagc cctccttctc 260



<210> 67 

<211> 248 

<212> DNA 

<213> Glycine max 

<400> 67 

caccaaccaa acctcactct ccctttctcc cctgaccctc tccctgccat ggtcatggga 60 
gcctttggcc acttcgaacc ggtctccaaa tgcagcaccg agaaccggtc taaccaaacc 120 
gtggcctcgg acttggacgg caccctcctg gtgtccccca gcgcatttcc ttactacatg 180 
ctgggcgcca tcgaagccgg cagcttcctc cgtggccttg tcctccttgc ctccgtccct 240 
ttcgtgta 248 

<210> 68 

<211> 283 

<212> DNA 

<213> Glycine max 



<400> 68 

ttcttcccca ccatcacacc aancaaacct cactctncct ggccatggtc atgnnngcct 60
ttccgccact tcgaaccggt ttccaaatgc agcaccgaaa accggtttaa ccaaaccgtg 120
gcctcggact tggacggcac cctcctggtg tcccctagcg cctttcctta ctacatgctc 180
gtcgccatcg aagccggcag cttcctccgt ggccttgtcc tccttggatc cgtccctttc 240
gtgtacttca cgtacatatt cttctccgag accgcggcca tca 283



<210> 69 

<211> 258 

<212> DNA 

<213> Glycine max 



<400> 69 

ctcttcttcc ccaccatcnn accaaccaaa cctcactctc cctgaccatg gtcatgggag 60 
cctttcgcca cttcgaaccg gtttccaaat gcagcaccga aaaccggttt aaccaaaccg 120 
tggcctcgga cttggacggc accctcctgg tgtcccctag cgcctttcct tactacatgc 180
tcgtcgccat cgaagccggc agcttcctcc gtggccttgt cctccttgga tccgtccctt 240 
tcgtgtactt cacgtaca 258 



<210> 70 

<211> 256 

<212> DNA 

<213> Glycine max 



<400> 70 

tgcaactaca acaacattca ttcattcaca gctgtcacgc cgtgaacgga aaatggcaac 60 
ggcgagacgc agtttcccgc ctatcaccga atgcaacgga acgacaccgt gcgagtctgt 120
ggccgccgac ctcgacggta cgctcctcat ctcccgtagc tcgttcccgt acttcatgct 180
cgtcgccgtc gaagccggca gcntcctccg cggcctcatc ctcctcctng ccantccgtt 240
cgtcatcanc gcctac 256

<210> 71 

<211> 259 

<212> DNA 

<213> Glycine max 

<400> 71 

cttccccacc atcacaccan ggcnaacctc antctccctt tctccacnga ccctctccct 60 
gccatngtca tgggancctt tggccacttc gaaccggtct ccaaatgcag caccgagaac 120
cggnctaacc aaaccgtggc ctcggacttg gacggcaccc tcctggtgtc ccncagcgca 180 
tttccttact acatgctggc ngccatcgaa gccggcagct tcctccgtgg ccttgtcctc 240 
cttgcctccg tccctttcg 259 



<210> 72 

<211> 249 

<212> DNA 

<213> Glycine max 

<400> 72 

ccaacatatt cttcagttag ctcccccaac ctatacactt caccaccaca ccacaaccct 60 
accctctctc tctgtcatgg tcattggagg agccttccct cgtttcgacc caatcaccaa 120 
atgtagcacc caagaccgct ccaaccagac catcgcctcg gacctcgatg gcaccctnct 180 
tgtctcccgg agtgccttcc cctactactt cctcgtcgcc ctcgaagccg gcagcgtctt 240 
ncgagccct 249

<210> 73
<211> 257
<212> DNA
<213> Glycine max

<400> 73

caaccctctt cttccccacc atcacaccaa ncaaacctca ctctcccttt ctcccctgac 60
cctctccctg ccatggtcat gggagccttt ggccacttcg aaccggtctc caaatgcagc 120
accgagaacc ggtctaacca aaccgtggcc tcggacttgg acggcaccct cctggtgtcc 180
cccagcgcat ntccttacta catgctggtc gccatcgaag ccggcagctt cctccgtggc 240
cttgtcctcc ttgcctg 257

<210> 74

<211> 255

<212> DNA

<213> Glycine max



<400> 74 

gccgaagacg tgcacccgga gagttggaga gtgttcaact ctttcgggaa gcgttacatt 60
gtcacggcta gtcctagggt gatggtggag ccgtttgtta aggcgtttct cggggctgac 120
aaggtgcttg ggactgaact tgaggccacc aaatcgggga cgttcactgg gtttgttaag 180
aagcctggtg tgcttgttgg ggagcataag aaagtggctc tggtgaagga gtttcagggt 240
aattacctga cttgg 255



<210> 75 

<211> 244 

<212> DNA 

<213> Glycine max 



<400> 75 
caacaacatt cattcattca cagctgtcac gccgtgaacg gaaaatggca acggcgagac 60
gcagtttccc gcctatcacc gaatgcaacg gaacgacacc gtgcgagtct gtggccgccg 120
acctcgacgg tacgctcctc atcncccgta gctcgttccc gtacttcatg ctcgtcgccg 180
tcgaagccgg cagcctcctc cgcggcctca tgcnttcctg ggtttanttt gagnacccct 240
gagg 244



<210> 76 

<211> 240 

<212> DNA 

<213> Glycine max 



<400> 76 

gctggctacc ctcttcttcc ccaccatcac accaatcaaa cctcactcta ccctggccat 60
ggtcatggga gcctttncgc cacttcgaac cggtttccaa atgcagcacc gaanaccggt 120
ttnaccanac cgtggcctcg gncttggacg gcaccctcct ggtgtcccct agcgcctttc 180
cttactacat gctcgtcgcc atcgaagccg gcagcttcct ccgtggcttg tcctccttgg 240



<210> 77 

<211> 263 

<212> DNA 

<213> Glycine max 



<400> 77 

gtttctcggg gctgacaagg tgcttgggac tgaacttgag gccaccaaat cggggacgtt 60
cactgggttt gttaagaagc ctggtgtgct tgttggggag cataagaaag tggctctggt 120
gaaggagttt cagggtaatt tacctgactt gggtctaggt gatagtaaaa gtgattatga 180
cttcatgtca atttgcaagg aagggtacat ggtgccaaga actaagtgtg aaccactacc 240
aagaaacaag cttttaagtc caa 263



<210> 78 

<211> 258 

<212> DNA 

<213> Glycine max 



<400> 78 

ggccacgaaa tcggggaggt tcactgggtt tgttaaggag cctggtgtgc ttgttgggga 60
gcacaagaaa gtggctgttg tgaaggagtt tcagggtaat ttacctgact tgggactagg 120
agatagtaaa agtgattatg acttcatgtc aatttgcaag gaagggtaca tggtgccaag 180
gactaagtgt gaaccactac caagaaacaa acttttaagt ccaattattt ntcatgaggg 240
taggtttgtt caaaggcc 258



<210> 79 

<211> 260 

<212> DNA 

<213> Glycine max 



<400> 79 

ctcttcttcc ccaccatcac accaancaaa cctcactctc cctttctccc ctgaccctct 60
ccctgccatg gtcatgggag cctttggcca cttcgaaccg gtctccaaat gcagcaccga 120
gaaccggtct aaccaaaccg tggcctcgga cttggacggc accctcctgg tgtcccccag 180
cgcatttcct tactacatgc tggtcgccat cgaagccggc agcttcctcc gtgggccttg 240
tcctccttgc ctccgtccct 260



<210> 80 

<211> 257 

<212> DNA 

<213> Glycine max 

<400> 80 

gggaacaaca acaaatggca ngaaccttat ctccttccaa cttggtgcat ttatccctgg 60 
atacccaatc cagcctgtaa ttgtacgcta tcctcatgtg cactttgacc aatcctgggg 120 
tcatgtntct ttgggaaagc ttatgttcag aatgttcact caatttcaca acttttttga 180 
ggtagaatat cttcctgtca tttatcccct ggatgataag gaaactgctg tancttntcg 240 
ggagaggact agccggg 257

<210> 81 

<211> 272 

<212> DNA 

<213> Glycine max 
<400> 81

catacctttt gttggcacca ttattagagc aatgcaggtc atatatgtta acagattctt 60 
accatcatca aggaagcagg ctgttaggga aataaaggaa ctgaataaca gagaagggcc 120 
tcttgtgata aatttcctcg agtactatta tttcccgagg gaacaacaac taatggcagg 180 
aaccttatct ccttccaact tggtgcattt atccctggat acccaatcca gcctgtaatt 240 
atacgctatc ctcatgtaca ctttgaccaa tc 272



<210> 82 
<211> 245 
<212> DNA 
<213> Glycine max 

<400> 82 

gggcatttca catactagag ttcatcccag tgaaaagaaa gtgggaggct gatgaatcaa 60
tcatgcgcca tatgctttct acattcaagg atccacaaga tcctctctgg cttgcgcttt 120
tcccagaagg cactgatttc actgagcaaa agtgccttcg gagtcaaaaa tatgctgctg 180
aacataagtt accggttctg aaaaatgttt tacttccaag gacaaagggg cttctgtgcc 240
gcttg 245



<210> 83 

<211> 268 

<212> DNA 

<213> Glycine max 

<400> 83 

cagtgtcctt cctttctgga caatgttttt ggtgttgacc cttcagaagt gcacctgcat 60 

gtgcggcgta ttccggtgga ggagattcca gcttctgaaa ccaaagctgc ttcttggtta 120 

atcgacacat tccagatcaa ggaccaattg ctttcggatt tcaagattca aggccatttc 180

cctaaccaac taaatgaaaa tgaaatttct agatttaaga gcctactctc ttttatggtg 240 

atagtttctt ttactgccat gtttattt 268 

<210> 84 

<211> 265 

<212> DNA 

<213> Glycine max 

<400> 84 

gaaagagact gggcaaaaga tgaaacatca ctgaagtcag gttttaggca tctagagcac 60 

atgccattcc ctttctggtt ggcccttttt gttgaaggaa ctcgtttcac gcagacaaag 120

cttttacaag ctcaagagtt tgctgcttca aaagggctgc ctatacctag aaatgttttg 180

attcctcgta ctaagggttt tgtcacagca gnacaaagcc ttcggccatt tcgttccagc 240
catttatgat tgcacatatg cagtt 265



<210> 85 

<211> 265 

<212> DNA 

<213> Glycine max 

<400> 85 

gaaagagact gggcaaaaga tgaaacatca ctgaagtcag gttttaggca tctagagcac 60 

atgccattcc ctttctggtt ggcccttttt gttgaaggaa ctcgtttcac gcagacaaag 120 

cttttacaag ctcaagagtt tgctgcttca aaagggctgc ctatacctag aaatgttttg 180 

attcctcgta ctaagggttt tgtcacagca gnacaaagcc ttcggccatt tcgttccagc 240 
catttatgat tgcacatatg cagtt 265

<210> 86
<211> 301
<212> DNA
<213> Zea mays



<400> 86 

ctcgtcgtca agggcacccc gccgccgccg cccaagaagg gccacccggg cgtcctcttc 60
gtctgcaacc accgcaccgt gctcgacccc gtcgaggtgg ccgtggcgct gcgccgcaag 120
gtcagctgcg tcacctacag catctccaag ttctccgagc tcatctcgcc catcaaggcc 180
gtcgcgctgt cgcgggaggc gacaaggacg ccgagaacat ccgccgcctg ctggaggagg 240
gcgacctggt catctgcccc gagggnaaca actgccgcga gcccttcctg ctgcgttcag 300
g 301



<210> 87 
<211> 309 
<212> DNA
<213> Zea mays 



<400> 87 

cgctcatgcg gtgtacatca acctgccgct gcccgagcgc atcgtctact acacctacaa 60
gctcatgggc atcaggctcg tcgtcaaggg caccccgccg ccgccgccca agaagggcca 120
cccgggcgtc ctcttcgtct gcaaccaccg caccgtgctc gaccccgtcg aggtggccgt 180
ggcgctgcgc cgcaaggtca gctgcgtcac ctacagcatc tccaagttct ccgagctcat 240
ctcgcccatc aaggccgtcg cgctgtcggg gaggcgacaa ggacgccgag aacatccgcc 300
gcctgctgg 309

<210> 88
<211> 304
<212> DNA
<213> Zea mays



<400> 88 

tggctgtgca ggaggcctac ctggtgacgt caaggaagta cagcccggtg cccaggaacc 60
agctgctgag cccgctgatt cgtgcacgac ggccgcctcg tgcagcgccc gacgccgctc 120
gtcgcgctcg tcaccttcct ctggatgccg ttcggcttcg cgctggcgct catgcgcgtg 180
tacatcaacc tgccgctgcc cgagcgcatc gtctactaca cctacaagct catgggcatc 240
aggctcgtcg tcaagggcac cccgccgccg ccgcccaaga agggccaccc gggcgtcctc 300
ttcg 304



<210> 89 
<211> 312 
<212> DNA 
<213> Zea mays 



<400> 89 

ggttcatcca cttgtgttgc tattngaccg gtaccgtagg agagcacagc actancatcg 60
caaagatttn gggctacggt gacaatctcc atgttctaca atcttnaggt cgaaggaatg 120
gagaatctgc ctccaaatag ctgtcctggt gtctatgttg ctaaccatca gagcttcttg 180
gatatttata cccttctaac tctagggagg tgcttcaaat ttataagcaa gaccagcatc 240
tttatgttcc ctattatagg gtgggcaatg tatctcttgg gtgtgattcc tctgcggcgt 300
atggacagca gg 312



<210> 90 
<211> 264 
<212> DNA 
<213> Zea mays 



<400> 90 

ggtgctgtat ctgaaagaat ccatcgtgct catcaacaga aaaatgcacc aatgatgcta 60
ctcttcccct gagggcacaa ctacaaatgg ggattatctc cttccattca aaacaggtgc 120
ttttcttgca aaggcaccag ttcaaccagt cattttgaga tatccttaca aaagatttaa 180
tgcagcatgg gattccatgt caggggcacg tcatgtattt ctgctgctct gtcaatttgt 240
aaattaccta gaggtggtcc gctt 264



<210> 91 
<211> 212 
<212> DNA 
<213> Zea mays 



<400> 91 

aaatgtcttg gatgcatttt tgttcagcgg gagtcgaaaa caccagattt caaaggtgtt 60
tcaggtgctg tatttgaaag aatccatcgt gctcatcaac agaaaaatgc accaatgatg 120
ctactcttcc ctgagggcac aactacaaat ggggattatc tccttccatt caaaacaggt 180
gcttttcttg caaaggcacc agttcaacca gt 212



<210> 92 
<211> 267 
<212> DNA 
<213> Zea mays 



<400> 92 

gtctaaagaa atngaaaggc gtggggnaat tgtgtctaat catgtntctt atgtggatat 60
tctttatcan atgtcagcct cttttcctag ttttgttgct aagagatcag tggntagatt 120
gcctctagtt ggtctcataa gcaaatgtct tggatgcatt tttgttcagc gggagtnnaa 180
aatncanatt tcaaaggtgt ttaaggtgtg gnatctgaaa gaatccatcg tgctcatcaa 240
cagaaaaatg caccaatgat gctactc 267



<210> 93 
<211> 152 
<212> DNA 
<213> Zea mays 

<400> 93 

ctacaaatgg ggattacctt cttccattta agactggagc ctttnttgca ggtgcaccag 60 
tgcagccagt cattttgaaa tacccttaca ggagatttag tccagcatgg gattcaatgg 120 
atggagcacg tcatgtgtta ttgctgctct gt 152 

<210> 94 
<211> 274 
<212> DNA 
<213> Zea mays 

<400> 94 

aaaatataaa ttaatatggt cttaatccca ccatataaat aacgttctct ttctgcaggg 60 
caatttagtt ctttctaata ttgggctggc agagaagcgc gtgtaccatg cagcactgac 120 
tggtagtagt ctacctggcg ctagacatga gaaagatgat tgaaagacgt tgcgtcgctt 180 
tttctgtaac agacagccga ggaacactta aaaatgtaac tgtgtgcgtg tttttatacc 240 
tgtaatgtgg cagtttattt gtttgaggag gctg 274 

<210> 95 
<211> 295 
<212> DNA 
<213> Zea mays 



<400> 95 

aatagctatc aagtacaata aaatatttgt tgatgccttt tggaacagta agaagcaatc 60
ttttacaatg cacttggtcc ggctgatgac atcatgggct gttgtgtgtg atgtttggta 120
cttacctcct caatatctga gggagggaga gacggcaatt gcatttgctg agagagtaag 180
ggacatgata gctgctagag ctggactaaa gaaggttcct tgggatggct atctgaaaca 240
caaccgtcct agtcccaaac acactgaaga gaacaacgca tattgccgat ctgtc 295



<210> 96 
<211> 273 
<212> DNA 
<213> Zea mays 

<400> 96 

gngccatctc accggcggcn ggcctgcggc cggcaaccgg aggcgatggc gagctngtct 60 

gtggtggcgg acatggagca ntaccgcccc aacctggagg actacctccc gcccgactcg 120 

ctcccgcagg aggcgcccag gaatctccat ctgcgcgatc tgcttgacat ctcgccggtg 180 

ctaaccgagg cagcgggtgc catagtcgat gattcattca cccgttgctt taagtcgaat 240 

tctccagaac catggaatgg aacatatatt tgt 273 

<210> 97 
<211> 127 
<212> DNA 
<213> Zea mays 

<400> 97 

ctcaatatct ganggaggga gagactgcaa ttgcgtttgc tgagagagta agggacatga 60 
tagcagctag agctggtctt aagaaggtcc cgtgggatgg ctatctgaag cacaaccgcc 120 
ctagtcc 127 

<210> 98 
<211> 286 
<212> DNA 
<213> Zea mays 



<400> 98 

gaaccgtacg cgcctcatta cgcccatcca cgtgctcgcc tctccccatc gcataatttt 60 

nctcggcggc gtcgccatct ccancggcng cnggcctgcn gccggcaacc ggaggcgatg 120 

gcgagctcgt ctgtggcggc ggacatggag ctggaccgcc ccaacctgga ggactacntc 180 

ccgcccgant cgctcccgca ggaggcgacc aggaatctcc atctgngcga tctgcttgan 240 

atctcgccgg tgctaaccga ggcagcgggt gccatagtcg atgatt 286 
<210> 99
<211> 308 
<212> DNA 
<213> Zea mays 



<400> 99 

cgccatctca tcggcggcgg gcgtgcggcc ggcggcngag gcgaggngcg attggcgagc 60
tcgtctgtgg cgccggacat ggagctggac cgcccanacc tggaggacta nctcccgccc 120
gactcgnncc cgcagaggcg ccccggaatc tccanctgcg cgatctgctg gacatcncgc 180
cggtgctcac cgaggcagcg ggtgccattg tcgatgactc cttcacacgg ngctttaagt 240
caaattctcc agagccatgg aattggaaca tatatctgtt ccccttatgt gctttggtgt 300
ataataag 308



<210> 100 
<211> 282 
<212> DNA 
<213> Zea mays 



<400> 100 

cagaaactag angttagtca cagcatggca ttaaattgtc atagtaaaca acancncact 60
gagcaactat gcaatttaat gccatgctgt gactaacttc tagtttctgg cattaaatta 120
ctgtttggct actaggaaga ccgaggtaga gaagcaaata taagaatacc ctccaacgca 180
canccaaatg acagagtaaa tgaaggtagg gttcaccttc ttgaacatga ccgtatactg 240
gttgttaaca caagttcctc tgggaaaatc agagagggtt tt 282



<210> 101 
<211> 282 
<212> DNA 
<213> Zea mays 

<400> 101 

ggcgcggctg gccgtggcgc tggtcctgcc gtacagtact cgacgccgat cctggcngcg 60
acnggcatgt cgtggcggct caaagggtng cgcccngngc ttgcnnngcc gtgctccggc 120
gggcgctgnc agctgttcgt gtgcaacnac cggacgctga tcgacccngt gtacgtgtcc 180
gtagcgtgga ccgggaaatg cgcgncgtgt nctacagnct gangcggntn tcggagctca 240
tctcccccat ngncggaang tgcacctgan accgggaacg gg 282



<210> 102 
<211> 290 
<212> DNA 
<213> Zea mays 



<400> 102 

ggacgcggca ccatgcgcgc cgagctggcc agtggcgacg tggccgtgtg ccccgagggc 60
accacgtgcc gggagccctt cctgctccgc ttctccaagc tcttcgcgga gctcagcgac 120
aggatcgtgc ccgtggcgat gaactaccgc gtggggctct tccacccgac gacggcgcgc 180
gggtggaaag ccatggaccc catcttcttc ttcatgaacn gcggcccgtg tacgaggtga 240
cgttcctgaa ccantccccg caaagcgacg tgcgcggcgg ggaagagccc 290



<210> 103 
<211> 279 
<212> DNA 
<213> Zea mays 



<400> 103 

acgaggtgac gttcctgaac cagctccccg cagaggcgac gtgcgcggcg gggaagagcc 60
ccgttgatgt agccaactac gttcagcgga tactcgctgc cacgctcggg ttcgagtgca 120
ccaccctcac aaggaaggac aaatacacgg tgctcgccgg caacgacggc gtcctgaacg 180
ccaagccggc ggcggcccgg aagccggctt ggcagagccg cgtgaaggaa gtcctcgggt 240
tctgctccac taacaattac accttgccca gatctggac 279



<210> 104 
<211> 315 
<212> DNA 
<213> Zea mays 



<400> 104 

gcccgagcgc atcgtctact acacctacaa gctcatgggc atcaggctcg tcgtcaaggg 60
caccccgccg ccgccgccca agaagggcca cccgggcgtc ctcttcgtct gcaaccaccg 120
caccgtgctc gaccccgtcg aggtggccgt ggcgctgcgc cgcaangtca gctgcgtcac 180
tacagcatct ccaagttctc cgagctcatc tcgcccatca aggccgtagc agnaaagcag 240
gtcgcaaatg gagcagnagc gagtcgatgg aagngaattg gcgactggtc atctgcncga 300
aggnacactg cggag 315

<210> 105 
<211> 314 
<212> DNA 
<213> Zea mays 

<400> 105 

cgagacaccg agcacgtact accagcaaga tggtggcgtc tcccagattc aagcccatcg 60 
aggagtgctg ctcggagggg cggtcggagc agacggtggc cgccgacctg gacggcacgc 120 
tgctcatctc caggagcgcg ttcccctact acctcctcgt ggctctcgag gccggcagcg 180
tcctccgcgc cgcgctgctg ctcctgtccg tgccgttcgt ctacgtcacc tacgccttct 240
tctccgagtc gctggccatc agcacgctgg tgtacatctc cgtggcgggg ctcaaggtgc 300
gcanatcgag atgg 314

<210> 106 
<211> 291 
<212> DNA 
<213> Zea mays 



<400> 106 

ctctgggtct ggggccgaga caccgagcac gtactaccag caagatggtg gcgtctccca 60
gattcaagcc catcgaggag tgctgctcgg aggggcggtc ggagcagacg gtggccgccg 120
acctggacgg cacgctgctc atntccagga gcgcgttccc ctactacctc ctcgtggctc 180
tcgaggccgg cagcgtcctc cgcgccgcgc tgctgctcct gtccgtgccg ttcgtctacg 240
tcacctacgc cttcttctcc gagtcgctgg ccatcagcac gctggtgtac a 291

<210> 107
<211> 300
<212> DNA
<213> Zea mays



<400> 107 

gcacgcagca gtacgacgtc tctcctctgg gtctggggcc gagacaccga gcacgtacta 60
ccagcaagat ggtggcgtct cccagattca agcccatcga ggagtgctgc tcggaggggc 120
ggtcggagca gacggtggcc gccgacctgg acggcacgct gctcatctcc aggagcgcgt 180
tcccctacta cctcctcgtg gctctcgagg ccggcagcgt cctccgcgcc gcgctgctgc 240
tcctgtccgt gccgttcgtc tacgtcacct acgccttctt ctccgagtcg ctggccatca 300



<210> 108 
<211> 284 
<212> DNA 
<213> Zea mays 

<400> 108 

gnggccgaga caccgagcac gtactaccag cangatggtg gcgtctccca gattcangcc 60
antcgaggag tgctgctcgg aggggcggtc ggagcagacg gtggccgccg acctggacgg 120
cacgctgctc atctccagga gcgcgttccc ctacnacctc ctcgtggctc tcgaggccgg 180
cagcgtcctc cgcgccgcgc tgctgctcct gtccgtgccg ttcgtctacg tcactacgcc 240
ttcttctccg agtcgctggc catcaanacg ctggtgtaca tctc 284



<210> 109 
<211> 280 
<212> DNA 
<213> Zea mays 



<400> 109 

ctcctctggg tctggggccg agacaccgag cacgtactac cagcaagatg gtggcgtctc 60
ccagattcaa gcccatcgag gagtgctgct cggaggggcg gtcggagcag acggtggccg 120
ccgacctgga cggcacgctg ctcatctcca ggagcgcgtt ccnctactac ctcctcgtgg 180
ctctcgaggc cggcagcgtc ctccgcgccg cgctgctgct cctgtccgtn ccgttcgtct 240
acgtcaccta cgcnttnttc tccgagtcgc tggccatcag 280



<210> 110 
<211> 287 
<212> DNA 
<213> Zea mays 
<400> 110

cgtctctcct ctgggtctgg ggccgagaca ccgagcacgt actaccagca agatggtggc 60
gtctcccaga ttcaagccca tcgaggagtg ctgctcggag gggcggtcgg agcagacggt 120
ggccgccgac ctggacggca gctgctcatc tccaggagcg cgttccccta ctacctcctc 180
gtggctctcg aggccggcag cgtcctccgc gccgcgctgc tgctcctgtc cgtgccgttc 240
gtctacgtca ctacggcttc ttctccgagt cgctggccat cagcacg 287



<210> 111 
<211> 286 
<212> DNA 
<213> Zea mays 



<400> 111 

cgcacagtta cgacgtctct cctctgggtc tggggccgag acaccgagca cgtactacca 60
gcaagatggt ggcgtctccc agattcaagc ccatcgagga gtgctgctcg gaggggcggt 120
cggagcagac ggtggccgcc gacctggacg gcacgctgct catctccagg agcgcgttcc 180
cctactactc ctcgtgctct cgaggccggc aggtcctccg cgccgcgctg tgctcctgtc 240
gtgcgttcgt ctagtcacta cgcttttctc gancgtggca ataana 286



<210> 112 
<211> 323 
<212> DNA 
<213> Zea mays 



<400> 112 

gttattccct gaaggtacca caacaaatgg gagattcctg atttcgttcc aacatggtgc 60
attcatacct ggctaccctg ttcaacctgt tgttgtccgt tatccacatg tgcactttga 120
tcaatcatgg gggnatatat cgttattaaa gctcatgttt aagatgttca cccaatttca 180
taatttcatg gaggtagagt accttcctgt tgtctaccct cctgagatca agcaagagaa 240
tgcccttcat tttgcggagg ataccagcta tgctatggca cgtgccctca atgtcttgcc 300
aacttcctat tcatatggtg att 323



<210> 113 
<211> 312 
<212> DNA 
<213> Zea mays 



<400> 113 

cgataaggcc cttttcgaag agcttctacc gtcggatcaa cagattcttg gccgagctgc 60
tgtggcttca gcttgtctgg gtggtggact ggtgggcagg tgttaaggta caactgcatg 120
cagatgagga aacttacaga tcaatgggta aagagcatgc actcatcata tcaaatcatc 180
ggagtgatat tgattggctc attggatgga tattggccca gcgttcaggg tgccttggaa 240
gtacacttgc tgtcatgaag aagtcatcca agttccttcc agttattggc tggtcaatgt 300
ggtttgcaga gt 312



<210> 114 
<211> 279 
<212> DNA 
<213> Zea mays 

<400> 114 

agtggggtct ccaaaggttg aaagacttcc ctagaccatt ttggctagct ctttttgttg 60 

agggtactcg ctttactcca gcaaagcttc tcgcagctca ggagtatgcg gcttcccagg 120 

gcttaccagc tcctagaaat gtacttattc cacgtaccaa gggatttgta tctgccgtaa 180 

gtattatgcg agattttgtt ccagccattt acgatacaac tgtaatagtt cctaaagatt 240 

cccctcaacc aacaatgctg cggattttga aagggcaat 279 

<210> 115 

<211> 304 

<212> DNA 

<213> Zea mays 



<400> 115 

cgtcaacgcc atccaggccg tcctatttgt gacgataagg cccttttcga agagcttcta 60
ccgtcggatc aacagattct tggccgagct gctgtggctt cagcttgtct gggtggtgga 120
ctggtgggca ggtgttaagg tacaactgca tgcagatgag gaaacttaca gatcaatggg 180
taaagagcat gcactcatca tatcaaatca tcggagtgat attgattggc tcatggatgg 240
atattggccc agcgttcagg gtgccttgga agtacattgc tgtcatgaag aagtcatcca 300
agtt 304
<210> 116
<211> 259 
<212> DNA 
<213> Zea mays 



<400> 116

cttcctcctg tccggcctca tcgtcaacgc catccaggcc gtcctatttg tgacgataag 60
gcccntttcg aagagcttct aacgtcggat caacagattc ntggccgagc tgctgtggct 120
tcagcttgtc tgggtggtgg acnggtgggc aggtgttaag gtacaactgc atgcngatga 180
ggaaacttac agatcnatgg gtanagagca tgcactcatc atatcaaatc atcggagtga 240
tattgattgg cncattgga 259

<210> 117 
<211> 235 
<212> DNA 
<213> Zea mays 



<400> 117 

attccacgta ccaagggatt tgtatctgct gtaagtatta tgcgagattt tgttccagcc 60
atttatgata caactgtaat agttcctaaa gattcccctc aaccaacaat gctgcggatt 120
ttgaaagggc aatcatcagt gatacatgtc cgcatgaaac gtcatgcaat gagtgagatg 180
ccaaaatcag atgaggatgt ttcaaaatgg tgtaaagaca tttttgtggc aaagg 235



<210> 118 
<211> 282 
<212> DNA 
<213> Zea mays 



<400> 118 

tgagatgcca aaatcagatg atgacgtttc aaaatggtgt aaagacattt ttgtgacaaa 60
ggatgcctta ctggacaaac atttggcaac aggcactttc gatgaggaga ttagacctat 120
cggccgccca gtgaaatcat tgctggtgac cctgttttgg tcgtgcctgc tgttgtttgg 180
tgccatcgag ttcttcaagt ggacgcagct cctatcgaca tggagaggag tggcattcac 240
tgccgcagga tggcgctcgt gacaggggtc atgcacgtct tc 282



<210> 119 
<211> 166 
<212> DNA 
<213> Zea mays 

<400> 119

ctggtgggca ggcgttaagg tacaactaca tgcggatgag gacacttacc gatcaatggg 60
taaagagcat gcactcgtca tatcaaatca tcgaagtgat attgattggc ttattggatg 120 
gatattggcc cagcgctcag ggtgccttgg aagtacgctc gctgtc 166 



<210> 120 
<211> 234 
<212> DNA 
<213> Zea mays 



<400> 120 

agtcanccaa gntccttcca gtcattggct ggtcaatgtg gtttgcagag tacctctttt 60
nggagaggag ctgggccaag gatgaaaaga cactaaagtg gggtctccaa aggttgaaag 120
acttccctag accatttngg ctagctcttn tttgtngagg gnantcgctt tactccagca 180
angnttntng aggnnncagn agnnncgggn ttcccanggg ttaacagncc cana 234



<210> 121 
<211> 210 
<212> DNA 
<213> Zea mays 



<400> 121 

gtgagatgcn aaaatcagat gatgacgttt caaaatggtg taaagacatt tttgtggaca 60
aaggatgcct tactggacaa acatttggca acaggcactt tcgatgagga gattagacct 120
atcggccgcc cagtgaaatc atngctggtg accctgtnnt ggtcgtgcct gctgttgttt 180
ggtgccatcg agntcttcaa gtggacgcag 210



<210> 122 
<211> 274 
<212> DNA 
<213> Zea mays



<400> 122 

acncccgaat ccgccgcgcg cgcnccgtcc tcgtcgccgg cggaggcgcc cgcnaccgcc 60
cacagcagcc tatcgccgga gaaggaacgc cgcggggagc ttttccacng ccatctcccg 120
tctgacccct ccgagatcgn aagcggcggc catggcgatc ccgctcgtgc tcgtcgtgct 180
cccgctcggc ctcctcttcc tcctgtccgg cctcatcgtc aacaccatcc aggccatcct 240
atttgtgaca ataaggccct tttccaagag



<210> 123 
<211> 305 
<212> DNA 
<213> Zea mays 



<400> 123 

ttgcactgag gaaaggccat tagggatata tcaagtacat acataagagc agcttgatga 60
agttgcctat ttttagctgg gcatttcaca tttttgagtt tatcccggta gaacggaaat 120
gggagattga tgaagcaatt attcagaaca agctatcaaa atttaagaac ccgagagatc 180
ctatctggtt ggcggttttt cctgaaggca cggattatac tgagaagaaa tgcatcatga 240
gtcaagagta tgcttcagaa catggcttgc ctatgctaga acatgtcctc cttccaaaga 300
caagg 305



<210> 124 
<211> 279 
<212> DNA 
<213> Zea mays 



<400> 124 

ccagattttc tggacaatgt gtatggcgtt gatccttctg aagtccacat ccacgtcaga 60
atggttcagc tccatcacat ccccacaaca gaagacaaga taacagaatg gatggncgag 120
aggtttaggc agaaggacca gctcctggca gatttcttca tgaaggggca tttcctgatg 180
aaaggaactg aaaggagatc tgtcgacgcc gagtgcctgg caaactttct taaccagtag 240
tatgcttgac ggccnatctg gtttgtacct aaactcttt 279



<210> 125 
<211> 219 
<212> DNA 
<213> Zea mays 



<400> 125

agattttntg gacaatgtgt atggngttga tccttntgaa gtncacatcc acgtnagaat 60
ggttcagctc catcacatcc ccacaacagn agacaagata acagaangga tggtagagag 120
gtttaggcag aaggaccagc tcctggcaga tttcttcatg aaggggcact ttcctgatga 180
aggaactgaa ggagatctgt cgacgccgaa gtgcctggc 219



<210> 126 
<211> 293 
<212> DNA 
<213> Zea mays 



<400> 126 

taccatagat gctgtgtacg acatcacgat cgcntacaaa caccggcngc ngacatttct 60
ngacaacgtc tacngcgtgg ntccttcgga agtccacatc cacatcanca gcatccaggt 120
ctccgacata ncggcgtccg aaaaacgggg tggctggcng gntnngtgga gcggttcaag 180
gcntnganna acgagctngc tgttcggggc tttctaccgc ggctggggcc aatttcnccc 240
cgaacgaaag ggaaaaaggg gaaccgaagg ggggaacctg ttngaacggg ncc 293



<210> 127 
<211> 6 
<212> PRT 

<213> conserved sequence 
<400> 127 

Val Xaa Asn His Xaa Ser 
1 5 



<210> 128 
<211> 6 
<212> PRT 
<213> conserved sequence
<400> 128 

Val Thr Tyr Ser Xaa Ser 
1 5 



<210> 129 
<211> 7 
<212> PRT 

<213> conserved sequence 
<400> 129 

Val Xaa Leu Thr Arg Xaa Arg 
1 5 



<210> 130 
<211> 5 
<212> PRT 

<213> conserved sequence 
<400> 130 

Cys Pro Glu Gly Thr 
1 5 



<210> 131 
<211> 5 
<212> PRT 

<213> conserved sequence 
<400> 131 

Ile Val Pro Val Ala
1 5 



<210> 132 
<211> 7 
<212> PRT 

<213> conserved sequence 
<400> 132 

Leu Xaa Xaa Gly Asp Leu Val 
1 5 



<210> 133 
<211> 6 
<212> PRT 

<213> conserved sequence 
<400> 133 

Phe Xaa Xaa Gly Ala Phe 
1 5 



<210> 134 
<211> 6 
<212> PRT 

<213> Synthetic Oligonucleotide 
<400> 134 

Val Ala Asn Xaa Xaa Gln
1 5 



<210> 135 
<211> 30 
<212> DNA 
<213> Synthetic Oligonucleotide
<400> 135 

ccatccgctt caagggaacg acacccatca 30

<210> 136 
<211> 31 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 136 

tccctgtctt gcttgatgaa cttaaagctt g 31 

<210> 137 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 137 

acagcaggag tgtctgatga tggcagattc 30

<210> 138 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 138 

actggagttc cagccaaaaa tgcacctgtc 30 

<210> 139 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 139 

gatacaccct tgaaatcagg cgattttgct 30

<210> 140 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 140 

ttgcaaattc aattcctgtt tcaccgggcc 30

<210> 141 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 141 

gttttctgct attccagaag gcgtcaacaa 30 

<210> 142 
<211> 32 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 142 

cattgaagat ccgtccgtga agttncctta cc 32 

<210> 143 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 143 

tcgagctgtg atcgatgatt ggctgtgaag 30



<210> 144 
<211> 30
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 144 

gtctcttcaa aaacacacac acacgtctct 

<210> 145 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 145 

gtctcttcaa aaacacacac acacgtctct 

<210> 146 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 146 

gtagagagcc ttacttgctt cggtttagtc 

<210> 147 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 147 

acgtcatcgt acctgttgct attgactcac 

<210> 148 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 148 

acttttccat tgtcagggac tcctcgacac 

<210> 149 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 149 

acggtgtagg aagggaaagg attcaaaagg 

<210> 150 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 150 

gcgatgaact acagagtcgg attcttcctc 

<210> 151 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 151 

ccggtttacg agattacgtt cttgaaccag 

<210> 152 

<211> 30 

<212> DNA 

<213> Synthetic Oligonucleotide 



<400> 152 

caatggagac aaggctcgaa agtgctaacc 
<210> 153
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 153 

attctctgaa catagttcgc cacggtcatg 

<210> 154 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 154 

gaaatccaac gccttcccaa tatcactctg 

<210> 155 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 155 

cttcaacttt ccatcaggat cttggcacgt 

<210> 156 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 156 

accacttgtt agagacctta cctgcttagg 

<210> 157 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 157 

tcctacctac accatccaat ttctcgaccc 

<210> 158 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 158 

ctgcgtcaag tgagcaactc agttcttgca 

<210> 159 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 159 

tgggaagcag cacgttgttc agtatcggaa 

<210> 160 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 160 

tagcctctgt gtaatctgtg ccctcgggga 

<210> 161 
<211> 1702 
<212> DNA 

<213> Simmondsia chinensis 
<400> 161

gaattctagc ctctctcctc ctgcaattct acttgctttc tacgatcttt ccctctctct 60
ctctaaaacc ttaaaattgg aatggaatcg tttaaaaata tgatcttttt gtaattgaat 120
tagtataatt atatctgggt aatcttgaat ttgttggtga ggccatgggg atcccagctg 180
cggctgtgat tgtaccgctt ggcttgctct tcttcttctc tggtctcttc atcaacttca 240
ttcaggcaat ttgttttgtg ctcgtgcggc cactgtcaaa gnntacatac agaaggatta 300
acagggtgct ggtggaattg ttgtggcttg agctgatatg gctcgtagat tggtgggcaa 360
gtgttaagat caagttgttc acagatcctg atacctttcg gctaatgggt aaagagcatg 420
cacttgtgat atcaaaccac agaagtgata ttgattggct tgttggatgg gtgttggccc 480
agagatcagg ctgcctggga agcacactgg ctgtcatgaa gaaatcatca aagtttctcc 540
cggtcatagg ttggtctatg tggttttctg agtacctttt tcttgagaga agctgggcca 600
aggatgaaag cacattgaag ttaggtcttc aacgcctcaa ggactaccct ctgcctttct 660
ggttggctct tttcgtagaa ggaacacgat ttacccaagc taaactttta gcagctcaag 720
aatatgctac ttcaatggga ttgccagttc ctagaaatac tttgatccct cgtactaagg 780
gatttgtttc agccgtgagc catatgcgtt cgtttgtccc ggccatatat gatgtaacgg 840
tggccatccc taaatcttct tcgcagccta caatgctcag acttttcaaa ggccagccat 900
ccacggttca tgtacacatc aagcgccgct cgatgaaaga tctccctgaa gcagcagatg 960
atgttgcaca atggtgtcga gacacattcg tcgcaaagga tgcactcctg gacaagcata 1020
atgtagatga cactttcgga gatgagtatc tgcaggacac tggccggcct ttgaaatctc 1080
tctttgtagc agtctcttgg gcattgattc tcatcctggg aggtttgaaa ttcctacgat 1140
ggtcgtccct tctatcatca tggaaggggg tcgccttctc agccgcatgc cttgtgctcg 1200
tcaccattct tatgcagatc ttaatccaat tttctcaatc cgagcgctcg actcctgcta 1260
aggtagcccc aggaaagccc aagaacatgg tatcagaacc cacggaaacg caacgacata 1320
agcagcacta aaagtatata tggaccccaa ctaagaagat tcagacgcaa gccacagttg 1380
attcaactgt tcagaatgtc aaatatagtt tgagaaacaa aagatcaaga ttagctgatg 1440
aagagcctaa tgaacctaca tacttggatc tgtcgtcgcc accgtctgct gctagctcgt 1500
tatcagaatt cgtgattccg ggaccgatcc cggatcttag ccttctatgc atggattatg 1560
atagtatctt aaatttcttt aatgatgtac cggaattata atgttagtta attaggggga 1620
tgagcattgt ttgggtttat atcgtggtaa atccttgtat tgtttataag atttgaagaa 1680
aattcgattc gagtgctctg 1702


<210> 162 
<211> 387 
<212> PRT 

<213> Simmondsia chinensis 
<400> 162 

Met Gly Ile Pro Ala Ala Ala Val Ile Val Pro Leu Gly Leu Leu Phe
1 5 10 15

Phe Phe Ser Gly Leu Phe Ile Asn Phe Ile Gln Ala Ile Cys Phe Val
20 25 30

Leu Val Arg Pro Leu Ser Lys Thr Tyr Arg Arg Ile Asn Arg Val Leu
35 40 45

Val Glu Leu Leu Trp Leu Glu Leu Ile Trp Leu Val Asp Trp Trp Ala
50 55 60

Ser Val Lys Ile Lys Leu Phe Thr Asp Pro Asp Thr Phe Arg Leu Met
65 70 75 80

Gly Lys Glu His Ala Leu Val Ile Ser Asn His Arg Ser Asp Ile Asp
85 90 95

Trp Leu Val Gly Trp Val Leu Ala Gln Arg Ser Gly Cys Leu Gly Ser
100 105 110

Thr Leu Ala Val Met Lys Lys Ser Ser Lys Phe Leu Pro Val Ile Gly
115 120 125

Trp Ser Met Trp Phe Ser Glu Tyr Leu Phe Leu Glu Arg Ser Trp Ala
130 135 140

Lys Asp Glu Ser Thr Leu Lys Leu Gly Leu Gln Arg Leu Lys Asp Tyr
145 150 155 160

Pro Leu Pro Phe Trp Leu Ala Leu Phe Val Glu Gly Thr Arg Phe Thr
165 170 175

Gln Ala Lys Leu Leu Ala Ala Gln Glu Tyr Ala Thr Ser Met Gly Leu
180 185 190

Pro Val Pro Arg Asn Thr Leu Ile Pro Arg Thr Lys Gly Phe Val Ser
195 200 205

Ala Val Ser His Met Arg Ser Phe Val Pro Ala Ile Tyr Asp Val Thr
210 215 220

Val Ala Ile Pro Lys Ser Ser Ser Gln Pro Thr Met Leu Arg Leu Phe
225 230 235 240

Lys Gly Gln Pro Ser Thr Val His Val His Ile Lys Arg Arg Ser Met
245 250 255

Lys Asp Leu Pro Glu Ala Ala Asp Asp Val Ala Gln Trp Cys Arg Asp
260 265 270

Thr Phe Val Ala Lys Asp Ala Leu Leu Asp Lys His Asn Val Asp Asp
275 280 285

Thr Phe Gly Asp Glu Tyr Leu Gln Asp Thr Gly Arg Pro Leu Lys Ser
290 295 300

Leu Phe Val Ala Val Ser Trp Ala Leu Ile Leu Ile Leu Gly Gly Leu
305 310 315 320

Lys Phe Leu Arg Trp Ser Ser Leu Leu Ser Ser Trp Lys Gly Val Ala
325 330 335

Phe Ser Ala Ala Cys Leu Val Leu Val Thr Ile Leu Met Gln Ile Leu
340 345 350

Ile Gln Phe Ser Gln Ser Glu Arg Ser Thr Pro Ala Lys Val Ala Pro
355 360 365

Gly Lys Pro Lys Asn Met Val Ser Glu Pro Thr Glu Thr Gln Arg His
370 375 380

Lys Gln His
385



<210> 163 
<211> 43 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 163 

aagcttgcat gcgtcgacac aatggttcat gcgaccaagt cag 43 

<210> 164 
<211> 35 
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<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 164 

ggtaccgtcg actcacttct tggtgttgtt gatag 

<210> 165 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 165 

ggatccgcgg ccgcacaatg acgagcttta ctacttccct tcat

<210> 166 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 166 

ggatcccctg caggttagag atccattgat tctgcaat

<210> 167 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 167 

ggatccgcgg ccgcataatg gaatcagagc tcaaagat 

<210> 168 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 168 

ggatcccctg caggtcattc ttctttctga tggaaatc 

<210> 169 

<211> 41 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 169 

ggatccgcgg ccgcacaatg actcgttcac aagatgtttc a 
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<210> 170 
<211> 38 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide

<400> 170 

ggatcccctg caggtcactt ctcttccaat ctagccag 38 



<210> 171 
<211> 46 
<212> DNA 
<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 171 

ggatccgcgg ccgcacaatg tccggtaata agatctcgac tcttca 46 



<210> 172 
<211> 46 
<212> DNA 
<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 172 

ggatcccctg caggttattt tttcttgaca actccgttat taccgg 46 



<210> 173 
<211> 39 
<212> DNA 
<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 173 

atatccgcgg ccgcacaatg gttatggagc aagctggaa 39



<210> 174 
<211> 38 
<212> DNA 
<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 174 

ggatcccctg caggtcaatg gagacaaggc tcgaaagt 38



<210> 175 
<211> 42 
<212> DNA 
<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 



<400> 175 
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ggatccgcgg ccgcacaatg tccgccaaga tttcaatatt cc 

<210> 176 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 176 

ggatcccctg caggttaatt tttcttaact actccatt 

<210> 177 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 177 

ggatccgcgg ccgcacaatg ggagctcagg agaaacggcg cc 

<210> 178 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 178 

ggatcccctg caggtcacgt cttctccttc ttcaccgg 

<210> 179 

<211> 44 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 179 

ggatccgcgg ccgcacaatg gcggatcctg atctgtcttc tcct 

<210> 180
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 180 

ggatcccctg caggttatgt tggggccaag tcaggtgcaa agat 

<210> 181 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 
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<400> 181 

ggatccgcgg ccgcaaaatg gaaaaaaaga gtgtaccaaa ttct 

<210> 182 
<211> 46 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 182 

ggatcccctg caggttattt gtttactaat ttgagggaat tttttg 

<210> 183 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 183 

tcgacctgca ggaagcttaa ggatggtgat tgctgc 

<210> 184 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 



<400> 184 

ggatccgcgg ccgcttactt ctccttctcc g 31

<210> 185
<211> 39
<212> DNA

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 185 

ggatccgcgg ccgcacaatg tcttttaggg atgtcctag 

<210> 186 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 186 

ggatcccctg caggtcaatc atccttaccc tttggtttac c 

<210> 187 

<211> 60 

<212> DNA 

<213> Artificial Sequence 



<220> 
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<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 187 

atgtctttta gggatgtcct agaaagagga gatgaatttt ctgtgcggta tttcacaccg 60 

<210> 188 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide

<400> 188 

tcaatcatcc ttaccctttg gtttaccctc tggaggcaga agattgtact gagagtgcac 60 

<210> 189 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 189 

ggatccgcgg ccgcacaatg aagcattccc aaaaataccg tagg 44 

<210> 190 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 190 

ggatcccctg caggtcaatg attttttttc atcacaaata c 41 

<210> 191 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 191 

atgaagcatt cccaaaaata ccgtaggtat ggaatttatg ctgtgcggta tttcacaccg 60 

<210> 192 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 192 

tcaatgattt tttttcatca caaatacaag aataagaaaa agattgtact gagagtgcac 60 

<210> 193 

<211> 43 

<212> DNA 

<213> Artificial Sequence 
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<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 193 

ggatccgcgg ccgcacaatg ggttttgttg atttcttcga aac 43 

<210> 194 
<211> 45 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 194 

ggatcccctg caggttattt ggtctcaatt ttaatatttt tttgc 45 

<210> 195 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 195 

atgggttttg ttgatttctt cgaaacatat atggtcggtt ctgtgcggta tttcacaccg 60 

<210> 196 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 196 

ttatttggtc tcaattttaa tatttttttg caaggactcg agattgtact gagagtgcac 60 

<210> 197 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 197 

ggatccgcgg ccgcacaatg gaaaagtaca ccaattggag agac 44 

<210> 198 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 198 

ggatcccctg caggctactt cctcttttta cgttgatcgc tg 42 

<210> 199 
<211> 60 
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<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide

<400> 199 

atggaaaagt acaccaattg gagagacaat ggtacgggaa ctgtgcggta tttcacaccg 60 

<210> 200 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 200 

ctacttcctc tttttacgtt gatcgctgat atattccttc agattgtact gagagtgcac 60 

<210> 201 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 201 

ggatccgcgg ccgcacaatg cctgcaccaa aactcacgga g 41 

<210> 202 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 202 

ggatcccctg caggctacgc atctccttct ttcccttc 38 

<210> 203 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 203 

atgcctgcac caaaactcac ggagaaatct gcctcttcca ctgtgcggta tttcacaccg 60 

<210> 204 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 204 

ctacgcatct ccttctttcc cttcttcttc ttcttcctct agattgtact gagagtgcac 60 
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<210> 205 
<211> 46 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 205 

ggatccgcgg ccgcacaatg tctgctcccg ctgccgatca taacgc 46 

<210> 206 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 206 

ggatcccctg caggtcattc tttcttttcg tgttctcttt tctg 44 

<210> 207 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 207 

atgtctgctc ccgctgccga tcataacgct gccaaaccta ctgtgcggta tttcacaccg 60 

<210> 208 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 208 

tcattctttc ttttcgtgtt ctcttttctg tcttaccagc agattgtact gagagtgcac 60 

<210> 209 
<211> 49 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 209 

ggatccgcgg ccgcacaatg ctgcatcaaa aaatagctca taaagttcg 49 

<210> 210 

<211> 49 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 



<400> 210 
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ggatcccctg caggtcaaaa aataaaacaa taaagtttat aaactaacc 49 

<210> 211 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 211 

atgctgcatc aaaaaatagc tcataaagtt cgaaaagtcg ctgtgcggta tttcacaccg 60 

<210> 212 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 212 

tcaaaaaata aaacaataaa gtttataaac taaccaaatt agattgtact gagagtgcac 60 

<210> 213 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 213 

ggatccgcgg ccgcacaatg agtgtgatag gtaggttctt g 41 

<210> 214 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 214 

ggatcccctg caggttaatg catctttttt acagatgaac c 41 

<210> 215 

<211> 60 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 215 

atgagtgtga taggtaggtt cttgtattac ttgaggtccg ctgtgcggta tttcacaccg 60 

<210> 216 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 
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<400> 216 

ttaatgcatc ttttttacag atgaaccttc gttatgggta agattgtact gagagtgcac 60 

<210> 217 
<211> 381 
<212> PRT 

<213> Saccharomyces sp.

<220> 

<400> 217 

Met Ser Phe Arg Asp Val Leu Glu Arg Gly Asp Glu Phe Leu Glu Ala
1 5 10 15

Tyr Pro Arg Arg Ser Pro Leu Trp Arg Phe Leu Ser Tyr Ser Thr Ser
20 25 30

Leu Leu Thr Phe Gly Val Ser Lys Leu Leu Leu Phe Thr Cys Tyr Asn
35 40 45

Val Lys Leu Asn Gly Phe Glu Lys Leu Glu Thr Ala Leu Glu Arg Ser
50 55 60

Lys Arg Glu Asn Arg Gly Leu Met Thr Val Met Asn His Met Ser Met
65 70 75 80

Val Asp Asp Pro Leu Val Trp Ala Thr Leu Pro Tyr Lys Leu Phe Thr
85 90 95

Ser Leu Asp Asn Ile Arg Trp Ser Leu Gly Ala His Asn Ile Cys Phe
100 105 110

Gln Asn Lys Phe Leu Ala Asn Phe Phe Ser Leu Gly Gln Val Leu Ser
115 120 125

Thr Glu Arg Phe Gly Val Gly Pro Phe Gln Gly Ser Ile Asp Ala Ser
130 135 140

Ile Arg Leu Leu Ser Pro Asp Asp Thr Leu Asp Leu Glu Trp Thr Pro
145 150 155 160

His Ser Glu Val Ser Ser Ser Leu Lys Lys Ala Tyr Ser Pro Pro Ile
165 170 175

Ile Arg Ser Lys Pro Ser Trp Val His Val Tyr Pro Glu Gly Phe Val
180 185 190

Leu Gln Leu Tyr Pro Pro Phe Glu Asn Ser Met Arg Tyr Phe Lys Trp
195 200 205

Gly Ile Thr Arg Met Ile Leu Glu Ala Thr Lys Pro Pro Ile Val Val
210 215 220

Pro Ile Phe Ala Thr Gly Phe Glu Lys Ile Ala Ser Glu Ala Val Thr
225 230 235 240

Asp Ser Met Phe Arg Gln Ile Leu Pro Arg Asn Phe Gly Ser Glu Ile
245 250 255

Asn Val Thr Ile Gly Asp Pro Leu Asn Asp Asp Leu Ile Asp Arg Tyr
260 265 270

Arg Lys Glu Trp Thr His Leu Val Glu Lys Tyr Tyr Asp Pro Lys Asn
275 280 285

Pro Asn Asp Leu Ser Asp Glu Leu Lys Tyr Gly Lys Glu Ala Gln Asp
290 295 300

Leu Arg Ser Arg Leu Ala Ala Glu Leu Arg Ala His Val Ala Glu Ile
305 310 315 320

Arg Asn Glu Val Arg Lys Leu Pro Arg Glu Asp Pro Arg Phe Lys Ser
325 330 335

Pro Ser Trp Trp Lys Arg Phe Asn Thr Thr Glu Gly Lys Ser Asp Pro
340 345 350

Asp Val Lys Val Ile Gly Glu Asn Trp Ala Ile Arg Arg Met Gln Lys
355 360 365

Phe Leu Pro Pro Glu Gly Lys Pro Lys Gly Lys Asp Asp
370 375 380

<210> 218 
<211> 396 
<212> PRT 

<213> Saccharomyces sp.

<220> 

<400> 218

Met Lys His Ser Gln Lys Tyr Arg Arg Tyr Gly Ile Tyr Glu Lys Thr
1 5 10 15

Gly Asn Pro Phe Ile Lys Gly Leu Gln Arg Leu Leu Ile Ala Cys Leu
20 25 30

Phe Ile Ser Gly Ser Leu Ser Ile Val Val Phe Gln Ile Cys Leu Gln
35 40 45

Val Leu Leu Pro Trp Ser Lys Ile Arg Phe Gln Asn Gly Ile Asn Gln
50 55 60

Ser Lys Lys Ala Phe Ile Val Leu Leu Cys Met Ile Leu Asn Met Val
65 70 75 80

Ala Pro Ser Ser Leu Asn Val Thr Phe Glu Thr Ser Arg Pro Leu Lys
85 90 95

Asn Ser Ser Asn Ala Lys Pro Cys Phe Arg Phe Lys Asp Arg Ala Ile
100 105 110

Ile Ile Ala Asn His Gln Met Tyr Ala Asp Trp Ile Tyr Leu Trp Trp
115 120 125

Leu Ser Phe Val Ser Asn Leu Gly Gly Asn Val Tyr Ile Ile Leu Lys
130 135 140

Lys Ala Leu Gln Tyr Ile Pro Leu Leu Gly Phe Gly Met Arg Asn Phe
145 150 155 160

Lys Phe Ile Phe Leu Ser Arg Asn Trp Gln Lys Asp Glu Lys Ala Leu
165 170 175

Thr Asn Ser Leu Val Ser Met Asp Leu Asn Ala Arg Cys Lys Gly Pro
180 185 190

Leu Thr Asn Tyr Lys Ser Cys Tyr Ser Lys Thr Asn Glu Ser Ile Ala
195 200 205

Ala Tyr Asn Leu Ile Met Phe Pro Glu Gly Thr Asn Leu Ser Leu Lys
210 215 220

Thr Arg Glu Lys Ser Glu Ala Phe Cys Gln Arg Ala His Leu Asp His
225 230 235 240

Val Gln Leu Arg His Leu Leu Leu Pro His Ser Lys Gly Leu Lys Phe
245 250 255

Ala Val Glu Lys Leu Ala Pro Ser Leu Asp Ala Ile Tyr Asp Val Thr
260 265 270

Ile Gly Tyr Ser Pro Ala Leu Arg Thr Glu Tyr Val Gly Thr Lys Phe
275 280 285

Thr Leu Lys Lys Ile Phe Leu Met Gly Val Tyr Pro Glu Lys Val Asp
290 295 300

Phe Tyr Ile Arg Glu Phe Arg Val Asn Glu Ile Pro Leu Gln Asp Asp
305 310 315 320

Glu Val Phe Phe Asn Trp Leu Leu Gly Val Trp Lys Glu Lys Asp Gln
325 330 335

Leu Leu Glu Asp Tyr Tyr Asn Thr Gly Gln Phe Lys Ser Asn Ala Lys
340 345 350

Asn Asp Asn Gln Ser Ile Val Val Thr Thr Gln Thr Thr Gly Phe Gln
355 360 365

His Glu Thr Leu Thr Pro Arg Ile Leu Ser Tyr Tyr Gly Phe Phe Ala
370 375 380

Phe Leu Ile Leu Val Phe Val Met Lys Lys Asn His
385 390 395



<210> 219 
<211> 479 
<212> PRT 

<213> Saccharomyces sp.
<220>

<400> 219

Met Gly Phe Val Asp Phe Phe Glu Thr Tyr Met Val Gly Ser Arg Val
1 5 10 15

Gln Phe Lys Gln Leu Asp Ile Ser Asp Trp Leu Ser Leu Thr Pro Arg
20 25 30

Leu Leu Ile Leu Phe Gly Tyr Phe Tyr Leu His Ser Phe Phe Thr Ala
35 40 45

Ile Asn Gln Phe Leu Gln Phe Ile Asn Thr Asn Ser Phe Cys Leu Arg
50 55 60

Leu His Leu Leu Tyr Asp Arg Phe Trp Ser His Val Pro Ile Ile Gly
65 70 75 80

Glu Tyr Lys Ile Arg Leu Leu Ser Arg Ala Leu Thr Tyr Ser Lys Leu
85 90 95

Lys Ile Ile Pro Thr Leu Asp Lys Val Leu Glu Ala Ile Glu Ile Trp
100 105 110

Phe Gln Leu His Leu Val Glu Met Thr Phe Glu Lys Lys Lys Asn Val
115 120 125

Gln Ile Phe Ile Thr Glu Gly Ser Asp Asp Leu Asn Phe Phe Lys Asp
130 135 140

Ser Lys Phe Gln Thr Thr Leu Met Ile Cys Asn His Arg Ser Val Asn
145 150 155 160

Asp Tyr Thr Leu Ile Asn Tyr Leu Phe Leu Lys Ser Cys Pro Thr Lys
165 170 175

Phe Tyr Thr Lys Trp Glu Phe Leu Gln Lys Leu Arg Lys Gly Glu Asp
180 185 190

Leu Ala Glu Trp Pro Gln Leu Lys Phe Leu Gly Trp Gly Lys Met Phe
195 200 205

Asn Phe Pro Arg Leu Asp Leu Leu Lys Asn Ile Phe Phe Lys Asp Glu
210 215 220

Thr Leu Ala Leu Ser Ser Asn Glu Leu Arg Asp Ile Leu Glu Arg Gln
225 230 235 240

Asn Asn Gln Ala Ile Thr Ile Phe Pro Glu Val Asn Ile Met Ser Leu
245 250 255

Glu Leu Ser Ile Ile Gln Arg Lys Leu His Gln Asp Phe Pro Phe Val
260 265 270

Ile Asn Phe Tyr Asn Leu Leu Tyr Pro Arg Phe Lys Asn Phe Thr Thr
275 280 285

Leu Met Ala Ala Phe Ser Ser Ile Lys Asn Ile Lys Arg Lys Lys Asn
290 295 300

Arg Asn Asn Ile Ile Lys Glu Ala Arg Tyr Leu Phe His Arg Glu Leu
305 310 315 320

Asp Lys Leu Val His Lys Ser Met Lys Met Glu Ser Ser Lys Val Ser
325 330 335

Asp Lys Thr Thr Pro Pro Met Ile Val Asp Asn Ser Tyr Leu Leu Thr
340 345 350

Lys Lys Glu Glu Ile Ser Ser Gly Lys Pro Lys Val Val Arg Ile Asn
355 360 365

Pro Tyr Ile Tyr Asp Val Thr Ile Ile Tyr Tyr Arg Val Lys Tyr Thr
370 375 380

Asp Ser Gly His Asp His Thr Asn Gly Asp Leu Arg Leu His Lys Gly
385 390 395 400

Tyr Gln Leu Glu Gln Ile Ser Pro Thr Ile Phe Glu Met Ile Gln Pro
405 410 415

Glu Met Glu Ser Glu Asn Asn Ile Lys Asp Lys Asp Pro Ile Val Val
420 425 430

Met Val Asn Val Lys Lys His Gln Ile Gln Pro Leu Leu Ala Tyr Asn
435 440 445

Asp Glu Ser Leu Glu Lys Trp Leu Glu Asn Arg Trp Ile Glu Lys Asp
450 455 460

Arg Leu Ile Glu Ser Leu Gln Lys Asn Ile Lys Ile Glu Thr Lys
465 470 475



<210> 220 
<211> 300 
<212> PRT 

<213> Saccharomyces sp.
<400> 220

Met Glu Lys Tyr Thr Asn Trp Arg Asp Asn Gly Thr Gly Ile Ala Pro
1 5 10 15

Phe Leu Pro Asn Thr Ile Arg Lys Pro Ser Lys Val Met Thr Ala Cys
20 25 30

Leu Leu Gly Ile Leu Gly Val Lys Thr Ile Ile Met Leu Pro Leu Ile
35 40 45

Met Leu Tyr Leu Leu Thr Gly Gln Asn Asn Leu Leu Gly Leu Ile Leu
50 55 60

Lys Phe Thr Phe Ser Trp Lys Glu Glu Ile Thr Val Gln Gly Ile Lys
65 70 75 80

Lys Arg Asp Val Arg Lys Ser Lys His Tyr Pro Gln Lys Gly Lys Leu
85 90 95

Tyr Ile Cys Asn Cys Thr Ser Pro Leu Asp Ala Phe Ser Val Val Leu
100 105 110

Leu Ala Gln Gly Pro Val Thr Leu Leu Val Pro Ser Asn Asp Ile Val
115 120 125

Tyr Lys Val Ser Ile Arg Glu Phe Ile Asn Phe Ile Leu Ala Gly Gly
130 135 140

Leu Asp Ile Lys Leu Tyr Gly His Glu Val Ala Glu Leu Ser Gln Leu
145 150 155 160

Gly Asn Thr Val Asn Phe Met Phe Ala Glu Gly Thr Ser Cys Asn Gly
165 170 175

Lys Ser Val Leu Pro Phe Ser Ile Thr Gly Lys Lys Leu Lys Glu Phe
180 185 190

Ile Asp Pro Ser Ile Thr Thr Met Asn Pro Ala Met Ala Lys Thr Lys
195 200 205

Lys Phe Glu Leu Gln Thr Ile Gln Ile Lys Thr Asn Lys Thr Ala Ile
210 215 220

Thr Thr Leu Pro Ile Ser Asn Met Glu Tyr Leu Ser Arg Phe Leu Asn
225 230 235 240

Lys Gly Ile Asn Val Lys Cys Lys Ile Asn Glu Pro Gln Val Leu Ser
245 250 255

Asp Asn Leu Glu Glu Leu Arg Val Ala Leu Asn Gly Gly Asp Lys Tyr
260 265 270

Lys Leu Val Ser Arg Lys Leu Asp Val Glu Ser Lys Arg Asn Phe Val
275 280 285

Lys Glu Tyr Ile Ser Asp Gln Arg Lys Lys Arg Lys
290 295 300

<210> 221 
<211> 759 
<212> PRT 

<213> Saccharomyces sp.
<400> 221

Met Pro Ala Pro Lys Leu Thr Glu Lys Phe Ala Ser Ser Lys Ser Thr
1 5 10 15

Gln Lys Thr Thr Asn Tyr Ser Ser Ile Glu Ala Lys Ser Val Lys Thr
20 25 30

Ser Ala Asp Gln Ala Tyr Ile Tyr Gln Glu Pro Ser Ala Thr Lys Lys
35 40 45

Ile Leu Tyr Ser Ile Ala Thr Trp Leu Leu Tyr Asn Ile Phe His Cys
50 55 60

Phe Phe Arg Glu Ile Arg Gly Arg Gly Ser Phe Lys Val Pro Gln Gln
65 70 75 80

Gly Pro Val Ile Phe Val Ala Ala Pro His Ala Asn Gln Phe Val Asp
85 90 95

Pro Val Ile Leu Met Gly Glu Val Lys Lys Ser Val Asn Arg Arg Val
100 105 110

Ser Phe Leu Ile Ala Glu Ser Ser Leu Lys Gln Pro Pro Ile Gly Phe
115 120 125

Leu Ala Ser Phe Phe Met Ala Ile Gly Val Val Arg Pro Gln Asp Asn
130 135 140

Leu Lys Pro Ala Glu Gly Thr Ile Arg Val Asp Pro Thr Asp Tyr Lys
145 150 155 160

Arg Val Ile Gly His Asp Thr His Phe Leu Thr Asp Cys Met Pro Lys
165 170 175

Gly Leu Ile Gly Leu Pro Lys Ser Met Gly Phe Gly Glu Ile Gln Ser
180 185 190

Ile Glu Ser Asp Thr Ser Leu Thr Leu Arg Lys Glu Phe Lys Met Ala
195 200 205

Lys Pro Glu Ile Lys Thr Ala Leu Leu Thr Gly Thr Thr Tyr Lys Tyr
210 215 220

Ala Ala Lys Val Asp Gln Ser Cys Val Tyr His Arg Val Phe Glu His
225 230 235 240

Leu Ala His Asn Asn Cys Ile Gly Ile Phe Pro Glu Gly Gly Ser His
245 250 255

Asp Arg Thr Asn Leu Leu Pro Leu Lys Ala Gly Val Ala Ile Met Ala
260 265 270

Leu Gly Cys Met Asp Lys His Pro Asp Val Asn Val Lys Ile Val Pro
275 280 285

Cys Gly Met Asn Tyr Phe His Pro His Lys Phe Arg Ser Arg Ala Val
290 295 300

Val Glu Phe Gly Asp Pro Ile Glu Ile Pro Lys Glu Leu Val Ala Lys
305 310 315 320

Tyr His Asn Pro Glu Thr Asn Arg Asp Ala Val Lys Glu Leu Leu Asp
325 330 335

Thr Ile Ser Lys Gly Leu Gln Ser Val Thr Val Thr Cys Ser Asp Tyr
340 345 350

Glu Thr Leu Met Val Val Gln Thr Ile Arg Arg Leu Tyr Met Thr Gln
355 360 365

Phe Ser Thr Lys Leu Pro Leu Pro Leu Ile Val Glu Met Asn Arg Arg
370 375 380

Met Val Lys Gly Tyr Glu Phe Tyr Arg Asn Asp Pro Lys Ile Ala Asp
385 390 395 400

Leu Thr Lys Asp Ile Met Ala Tyr Asn Ala Ala Leu Arg His Tyr Asn
405 410 415

Leu Pro Asp His Leu Val Glu Glu Ala Lys Val Asn Phe Ala Lys Asn
420 425 430

Leu Gly Leu Val Phe Phe Arg Ser Ile Gly Leu Cys Ile Leu Phe Ser
435 440 445

Leu Ala Met Pro Gly Ile Ile Met Phe Ser Pro Val Phe Ile Leu Ala
450 455 460

Lys Arg Ile Ser Gln Glu Lys Ala Arg Thr Ala Leu Ser Lys Ser Thr
465 470 475 480

Val Lys Ile Lys Ala Asn Asp Val Ile Ala Thr Trp Lys Ile Leu Ile
485 490 495

Gly Met Gly Phe Ala Pro Leu Leu Tyr Ile Phe Trp Ser Val Leu Ile
500 505 510

Thr Tyr Tyr Leu Arg His Lys Pro Trp Asn Lys Ile Tyr Val Phe Ser
515 520 525

Gly Ser Tyr Ile Ser Cys Val Ile Val Thr Tyr Ser Ala Leu Ile Val
530 535 540

Gly Asp Ile Gly Met Asp Gly Phe Lys Ser Leu Arg Pro Leu Val Leu
545 550 555 560

Ser Leu Thr Ser Pro Lys Gly Leu Gln Lys Leu Gln Lys Asp Arg Arg
565 570 575

Asn Leu Ala Glu Arg Ile Ile Glu Val Val Asn Asn Phe Gly Ser Glu
580 585 590

Leu Phe Pro Asp Phe Asp Ser Ala Ala Leu Arg Glu Glu Phe Asp Val
595 600 605

Ile Asp Glu Glu Glu Glu Asp Arg Lys Thr Ser Glu Leu Asn Arg Arg
610 615 620

Lys Met Leu Arg Lys Gln Lys Ile Lys Arg Gln Glu Lys Asp Ser Ser
625 630 635 640

Ser Pro Ile Ile Ser Gln Arg Asp Asn His Asp Ala Tyr Glu His His
645 650 655

Asn Gln Asp Ser Asp Gly Val Ser Leu Val Asn Ser Asp Asn Ser Leu
660 665 670

Ser Asn Ile Pro Leu Phe Ser Ser Thr Phe His Arg Lys Ser Glu Ser
675 680 685

Ser Leu Ala Ser Thr Ser Val Ala Pro Ser Ser Ser Ser Glu Phe Glu
690 695 700

Val Glu Asn Glu Ile Leu Glu Glu Lys Asn Gly Leu Ala Ser Lys Ile
705 710 715 720

Ala Gln Ala Val Leu Asn Lys Arg Ile Gly Glu Asn Thr Ala Arg Glu
725 730 735

Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu
740 745 750

Glu Gly Lys Glu Gly Asp Ala
755

<210> 222 
<211> 743 
<212> PRT 

<213> Saccharomyces sp.

<400> 222

Met Ser Ala Pro Ala Ala Asp His Asn Ala Ala Lys Pro Ile Pro His
1 5 10 15

Val Pro Gln Ala Ser Arg Arg Tyr Lys Asn Ser Tyr Asn Gly Phe Val
20 25 30

Tyr Asn Ile His Thr Trp Leu Tyr Asp Val Ser Val Phe Leu Phe Asn
35 40 45

Ile Leu Phe Thr Ile Phe Phe Arg Glu Ile Lys Val Arg Gly Ala Tyr
50 55 60

Asn Val Pro Glu Val Gly Val Pro Thr Ile Leu Val Cys Ala Pro His
65 70 75 80

Ala Asn Gln Phe Ile Asp Pro Ala Leu Val Met Ser Gln Thr Arg Leu
85 90 95

Leu Lys Thr Ser Ala Gly Lys Ser Arg Ser Arg Met Pro Cys Phe Val
100 105 110

Thr Ala Glu Ser Ser Phe Lys Lys Arg Phe Ile Ser Phe Phe Gly His
115 120 125

Ala Met Gly Gly Ile Pro Val Pro Arg Ile Gln Asp Asn Leu Lys Pro
130 135 140

Val Asp Glu Asn Leu Glu Ile Tyr Ala Pro Asp Leu Lys Asn His Pro
145 150 155 160

Glu Ile Ile Lys Gly Arg Ser Lys Asn Pro Gln Thr Thr Pro Val Asn
165 170 175

Phe Thr Lys Arg Phe Ser Ala Lys Ser Leu Leu Gly Leu Pro Asp Tyr
180 185 190

Leu Ser Asn Ala Gln Ile Lys Glu Ile Pro Asp Asp Glu Thr Ile Ile
195 200 205

Leu Ser Ser Pro Phe Arg Thr Ser Lys Ser Lys Val Val Glu Leu Leu
210 215 220

Thr Asn Gly Thr Asn Phe Lys Tyr Ala Glu Lys Ile Asp Asn Thr Glu
225 230 235 240

Thr Phe Gln Ser Val Phe Asp His Leu His Thr Lys Gly Cys Val Gly
245 250 255

Ile Phe Pro Glu Gly Gly Ser His Asp Arg Pro Ser Leu Leu Pro Ile
260 265 270

Lys Ala Gly Val Ala Ile Met Ala Leu Gly Ala Val Ala Ala Asp Pro
275 280 285

Thr Met Lys Val Ala Val Val Pro Cys Gly Leu His Tyr Phe His Arg
290 295 300

Asn Lys Phe Arg Ser Arg Ala Val Leu Glu Tyr Gly Glu Pro Ile Val
305 310 315 320

Val Asp Gly Lys Tyr Gly Glu Met Tyr Lys Asp Ser Pro Arg Glu Thr
325 330 335

Val Ser Lys Leu Leu Lys Lys Ile Thr Asn Ser Leu Phe Ser Val Thr
340 345 350

Glu Asn Ala Pro Asp Tyr Asp Thr Leu Met Val Ile Gln Ala Ala Arg
355 360 365

Arg Leu Tyr Gln Pro Val Lys Val Arg Leu Pro Leu Pro Ala Ile Val
370 375 380

Glu Ile Asn Arg Arg Leu Leu Phe Gly Tyr Ser Lys Phe Lys Asp Asp
385 390 395 400

Pro Arg Ile Ile His Leu Lys Lys Leu Val Tyr Asp Tyr Asn Arg Lys
405 410 415

Leu Asp Ser Val Gly Leu Lys Asp His Gln Val Met Gln Leu Lys Thr
420 425 430

Thr Lys Leu Glu Ala Leu Arg Cys Phe Val Thr Leu Ile Val Arg Leu
435 440 445

Ile Lys Phe Ser Val Phe Ala Ile Leu Ser Leu Pro Gly Ser Ile Leu
450 455 460

Phe Thr Pro Ile Phe Ile Ile Cys Arg Val Tyr Ser Glu Lys Lys Ala
465 470 475 480

Lys Glu Gly Leu Lys Lys Ser Leu Val Lys Ile Lys Gly Thr Asp Leu
485 490 495

Leu Ala Thr Trp Lys Leu Ile Val Ala Leu Ile Leu Ala Pro Ile Leu
500 505 510

Tyr Val Thr Tyr Ser Ile Leu Leu Ile Ile Leu Ala Arg Lys Gln His
515 520 525

Tyr Cys Arg Ile Trp Val Pro Ser Asn Asn Ala Phe Ile Gln Phe Val
530 535 540

Tyr Phe Tyr Ala Leu Leu Val Phe Thr Thr Tyr Ser Ser Leu Lys Thr
545 550 555 560

Gly Glu Ile Gly Val Asp Leu Phe Lys Ser Leu Arg Pro Leu Phe Val
565 570 575

Ser Ile Val Tyr Pro Gly Lys Lys Ile Glu Glu Ile Gln Thr Thr Arg
580 585 590

Lys Asn Leu Ser Leu Glu Leu Thr Ala Val Cys Asn Asp Leu Gly Pro
595 600 605

Leu Val Phe Pro Asp Tyr Asp Lys Leu Ala Thr Glu Ile Phe Ser Lys
610 615 620

Arg Asp Gly Tyr Asp Val Ser Ser Asp Ala Glu Ser Ser Ile Ser Arg
625 630 635 640

Met Ser Val Gln Ser Arg Ser Arg Ser Ser Ser Ile His Ser Ile Gly
645 650 655

Ser Leu Ala Ser Asn Ala Leu Ser Arg Val Asn Ser Arg Gly Ser Leu
660 665 670

Thr Asp Ile Pro Ile Phe Ser Asp Ala Lys Gln Gly Gln Trp Lys Ser
675 680 685

Glu Gly Glu Thr Ser Glu Asp Glu Asp Glu Phe Asp Glu Lys Asn Pro
690 695 700

Ala Ile Val Gln Thr Ala Arg Ser Ser Asp Leu Asn Lys Glu Asn Ser
705 710 715 720

Arg Asn Thr Asn Ile Ser Ser Lys Ile Ala Ser Leu Val Arg Gln Lys
725 730 735

Arg Glu His Glu Lys Lys Glu
740
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<210> 223 
<211> 397 
<212> PRT 

<213> Saccharomyces sp. 
<400> 223 

Met Leu His Gln Lys Ile Ala His Lys Val Arg Lys Val Val Val Pro
1 5 10 15

Gly Ile Ser Leu Leu Ile Phe Phe Gln Gly Cys Leu Ile Leu Leu Phe
20 25 30

Leu Gln Leu Thr Tyr Lys Thr Leu Tyr Cys Arg Asn Asp Ile Arg Lys
35 40 45

Gln Ile Gly Leu Asn Lys Thr Lys Arg Leu Phe Ile Val Leu Val Ser
50 55 60

Ser Ile Leu His Val Val Ala Pro Ser Ala Val Arg Ile Thr Thr Glu
65 70 75 80

Asn Ser Ser Val Pro Lys Gly Thr Phe Phe Leu Asp Leu Lys Lys Lys
85 90 95

Arg Ile Leu Ser His Leu Lys Ser Asn Ser Val Ala Ile Cys Asn His
100 105 110

Gln Ile Tyr Thr Asp Trp Ile Phe Leu Trp Trp Leu Ala Tyr Thr Ser
115 120 125

Asn Leu Gly Ala Asn Val Phe Ile Ile Leu Lys Lys Ser Leu Ala Ser
130 135 140

Ile Pro Ile Leu Gly Phe Gly Met Arg Asn Tyr Asn Phe Ile Phe Met
145 150 155 160

Ser Arg Lys Trp Ala Gln Asp Lys Ile Thr Leu Ser Asn Ser Leu Ala
165 170 175

Gly Leu Asp Ser Asn Ala Arg Gly Ala Gly Ser Leu Ala Gly Lys Ser
180 185 190

Pro Glu Arg Ile Thr Glu Glu Gly Glu Ser Ile Trp Asn Pro Glu Val
195 200 205

Ile Asp Pro Lys Gln Ile His Trp Pro Tyr Asn Leu Ile Leu Phe Pro
210 215 220

Glu Gly Thr Asn Leu Ser Ala Asp Thr Arg Gln Lys Ser Ala Lys Tyr
225 230 235 240

Ala Ala Lys Ile Gly Lys Lys Pro Phe Lys Asn Val Leu Leu Pro His
245 250 255

Ser Thr Gly Leu Arg Tyr Ser Leu Gln Lys Leu Lys Pro Ser Ile Glu
260 265 270

Ser Leu Tyr Asp Ile Thr Ile Gly Tyr Ser Gly Val Lys Gln Glu Glu
275 280 285

Tyr Gly Glu Leu Ile Tyr Gly Leu Lys Ser Ile Phe Leu Glu Gly Lys
290 295 300

Tyr Pro Lys Leu Val Asp Ile His Ile Arg Ala Phe Asp Val Lys Asp
305 310 315 320

Ile Pro Leu Glu Asp Glu Asn Glu Phe Ser Glu Trp Leu Tyr Lys Ile
325 330 335

Trp Ser Glu Lys Asp Ala Leu Met Glu Arg Tyr Tyr Ser Thr Gly Ser
340 345 350

Phe Val Ser Asp Pro Glu Thr Asn His Ser Val Thr Asp Ser Phe Lys
355 360 365

Ile Asn Arg Ile Glu Leu Thr Glu Val Leu Ile Leu Pro Thr Leu Thr
370 375 380

Ile Ile Trp Leu Val Tyr Lys Leu Tyr Cys Phe Ile Phe
385 390 395



<210> 224 
<211> 303 
<212> PRT 

<213> Saccharomyces sp. 
<400> 224 

Met Ser Val Ile Gly Arg Phe Leu Tyr Tyr Leu Arg Ser Val Leu Val
1 5 10 15

Val Leu Ala Leu Ala Gly Cys Gly Phe Tyr Gly Val Ile Ala Ser Ile
20 25 30

Leu Cys Thr Leu Ile Gly Lys Gln His Leu Ala Gln Trp Ile Thr Ala
35 40 45

Arg Cys Phe Tyr His Val Met Lys Leu Met Leu Gly Leu Asp Val Lys
50 55 60

Val Val Gly Glu Glu Asn Leu Ala Lys Lys Pro Tyr Ile Met Ile Ala
65 70 75 80

Asn His Gln Ser Thr Leu Asp Ile Phe Met Leu Gly Arg Ile Phe Pro
85 90 95

Pro Gly Cys Thr Val Thr Ala Lys Lys Ser Leu Lys Tyr Val Pro Phe
100 105 110

Leu Gly Trp Phe Met Ala Leu Ser Gly Thr Tyr Phe Leu Asp Arg Ser
115 120 125

Lys Arg Gln Glu Ala Ile Asp Thr Leu Asn Lys Gly Leu Glu Asn Val
130 135 140

Lys Lys Asn Lys Arg Ala Leu Trp Val Phe Pro Glu Gly Thr Arg Ser
145 150 155 160

Tyr Thr Ser Glu Leu Thr Met Leu Pro Phe Lys Lys Gly Ala Phe His
165 170 175

Leu Ala Gln Gln Gly Lys Ile Pro Ile Val Pro Val Val Val Ser Asn
180 185 190

Thr Ser Thr Leu Val Ser Pro Lys Tyr Gly Val Phe Asn Arg Gly Cys
195 200 205

Met Ile Val Arg Ile Leu Lys Pro Ile Ser Thr Glu Asn Leu Thr Lys
210 215 220

Asp Lys Ile Gly Glu Phe Ala Glu Lys Val Arg Asp Gln Met Val Asp
225 230 235 240

Thr Leu Lys Glu Ile Gly Tyr Ser Pro Ala Ile Asn Asp Thr Thr Leu
245 250 255

Pro Pro Gln Ala Ile Glu Tyr Ala Ala Leu Gln His Asp Lys Lys Val
260 265 270

Asn Lys Lys Ile Lys Asn Glu Pro Val Pro Ser Val Ser Ile Ser Asn
275 280 285

Asp Val Asn Thr His Asn Glu Gly Ser Ser Val Lys Lys Met His
290 295 300



<210> 225 
<211> 1146 
<212> DNA 

<213> Saccharomyces sp.



<400> 225 

atgtctttta gggatgtcct agaaagagga gatgaatttt tagaagccta tcccagaaga 60
agcccccttt ggagatttct ttcatacagt acatcattac tgaccttcgg tgtatcaaaa 120
ctgcttcttt tcacatgcta taatgtcaaa ttgaatggtt ttgaaaaatt agaaactgcc 180
ttggaacgtt ccaaaaggga aaatagaggc cttatgacgg tcatgaacca tatgagtatg 240
gtcgatgatc cgttagtttg ggcaacacta ccatataagt tatttacgtc tttggacaac 300
ataagatggt ctttgggtgc acataatatt tgctttcaaa ataaatttct ggccaacttt 360
ttctcacttg gccaagtcct ttcaacagaa agatttgggg tgggcccatt tcaaggttct 420
atagatgctt caataagatt gttaagccct gacgacactt tagacttgga atggacccct 480
cactctgagg tctcttcttc gctaaaaaaa gcctactccc cgcccataat aaggtcgaag 540
ccatcttggg tccatgttta tccagaagga tttgtactac aattatatcc gccttttgaa 600
aattcgatga ggtattttaa atggggtatt accagaatga tcctagaagc aacaaagccg 660
cccattgtag taccaatatt tgctacaggg tttgaaaaaa tagcatccga agcagtcaca 720
gattcaatgt ttagacaaat tctaccaaga aactttggct ctgaaataaa tgttaccata 780
ggggatcctt taaatgatga tttaatcgac aggtatagaa aagaatggac acatttggtt 840
gaaaaatact atgatcccaa aaatcctaac gacctctctg acgaattgaa atatggtaaa 900
gaggcgcaag atttaagaag cagattagcc gctgaactga gagcccatgt tgctgaaatt 960
agaaatgaag ttcgcaaatt accacgcgaa gaccctaggt tcaaatcccc ctcatggtgg 1020
aagcggttca acaccacgga aggtaaatcg gacccagatg ttaaagtcat tggcgaaaat 1080
tgggcaataa ggaggatgca aaagtttctg cctccagagg gtaaaccaaa gggtaaggat 1140
gattga 1146



<210> 226 
<211> 1191 
<212> DNA 

<213> Saccharomyces sp.



<400> 226 

atgaagcatt cccaaaaata ccgtaggtat ggaatttatg aaaagactgg taatcccttt 60
ataaaagggt tgcaaaggct gcttatcgct tgcttgttca tttcaggctc gctgagtatt 120
gtcgtttttc agatctgtct acaggtgctt ctcccttgga gcaagattag atttcaaaat 180
ggtataaatc aaagtaagaa ggcttttatc gttttattat gcatgatctt gaacatggtg 240
gctccctctt ctttgaatgt cacttttgaa acatcgcggc cattgaagaa ctcttctaac 300
gccaagccat gctttagatt taaagacagg gctataataa ttgcaaatca tcaaatgtat 360
gcagactgga tttatctctg gtggctttcc tttgtttcaa atttgggtgg taacgtttat 420
atcatcctga agaaagctct gcagtacata ccattactgg gatttggcat gcgaaatttt 480
aagtttatat ttttaagtag gaactggcaa aaggatgaga aagctttaac aaatagtttg 540
gtttctatgg acttaaacgc gaggtgcaag gggcccctta caaattataa gagttgttat 600
tccaagacaa atgaatccat tgccgcttat aatttaatca tgttccctga gggtacaaat 660
ctaagcctca agacaagaga aaaaagcgag gcattctgtc aaagagcaca tttggaccat 720
gtccaattaa gacatttgtt attaccgcac tctaaaggct tgaagtttgc agtagaaaaa 780
ctagctccta gtttagatgc tatctacgat gtcactattg gatattctcc cgccttgaga 840
acggaatacg tcggcaccaa attcaccttg aagaaaatat tcttaatggg tgtctatccg 900
gagaaagtag atttttatat tagggaattt agagttaatg agatcccttt gcaagatgac 960
gaagtttttt tcaattggtt actgggcgtg tggaaagaaa aagatcaact gctagaagac 1020
tactacaaca caggccaatt taaaagtaat gctaaaaatg acaaccaatc catcgttgtt 1080
acgacacaaa cgactggatt tcagcacgaa acattgacac cccgtatcct ttcatattac 1140
gggttcttcg cttttcttat tcttgtattt gtgatgaaaa aaaatcattg a 1191
<210> 227
<211> 1440
<212> DNA

<213> Saccharomyces sp.



<400> 227 

atgggttttg ttgatttctt cgaaacatat atggtcggtt ctagggtcca gttcaaacag 60
ttagatattt ctgattggtt gagtctgacc ccaaggttgc ttattctttt tggctatttt 120
taccttcatt ctttttttac tgcaatcaat caattcctac agttcattaa cacgaattcc 180
ttctgtctta gactgcattt actatatgac agattttggt cgcatgtgcc cataataggt 240
gagtacaaaa ttcggctgct ctcgagggca ctgacatata gtaaactgaa aataatacca 300
actttagaca aggtgctgga ggcgattgaa atttggtttc agctacattt agttgaaatg 360
accttcgaaa aaaaaaaaaa cgtccaaatt ttcataaccg agggaagtga tgacctaaac 420
ttttttaaag atagcaaatt ccaaaccaca ttaatgatat gtaatcatcg atcagtgaat 480
gactacacat tgattaatta cctttttctc aaaagttgtc ccaccaagtt ttatactaaa 540
tgggaatttc tacaaaagct gaggaagggg gaagatctag ctgaatggcc tcagttaaaa 600
tttcttggtt ggggaaaaat gtttaacttt cctcgattgg atctactaaa gaacatattc 660
ttcaaagatg aaacactcgc actctcatcg aatgagttaa gagatatttt agaaagacaa 720
aacaatcaag ctattactat ttttcccgaa gtcaatatca tgagtttgga actatcaatt 780
attcaaagaa aattacacca agattttccc tttgttataa acttctataa tttattatac 840
ccaagattta aaaactttac cactttgatg gctgcttttt catcaattaa aaacatcaaa 900
agaaagaaaa accgtaacaa tataatcaaa gaggcccgat acctgtttca cagagaactt 960
gacaaattag ttcacaagag catgaaaatg gagtcttcca aggtatccga taagacgacg 1020
ccgcccatga tcgtagataa ttcatactta cttacaaaaa aggaagaaat cagcagcggc 1080
aagcccaagg tggtacgaat caatccatac atatatgatg tcaccataat ttattaccga 1140
gtcaaatata ctgatagtgg gcatgatcat accaacggag atttgagact tcataaaggt 1200
tatcaattag agcaaatatc tccgacaatc tttgagatga ttcaaccaga aatggagtct 1260
gaaaacaaca taaaggataa ggaccccatt gttgtgatgg taaatgtaaa aaagcatcaa 1320
attcaaccat tactcgcata caatgatgag agtttagaaa agtggcttga aaataggtgg 1380
atagaaaaag atagattaat cgagtccttg caaaaaaata ttaaaattga gaccaaataa 1440



<210> 228 
<211> 903 
<212> DNA 

<213> Saccharomyces sp.



<400> 228 

atggaaaagt acaccaattg gagagacaat ggtacgggaa tagctccatt tctaccaaac 60
acaatcagga aacctagtaa ggtgatgaca gcgtgtttgt tgggtatcct aggggtgaaa 120
accattataa tgctaccatt gattatgctg taccttctaa ctggccagaa caacttactg 180
ggtttgatat tgaagtttac attcagttgg aaagaggaaa ttaccgtgca aggaatcaag 240
aaacgtgacg taaggaaatc caagcattat ccacagaagg gcaagcttta tatttgcaat 300
tgtacctcac ctttagatgc tttttcagtg gtgttattag ctcaagggcc tgttacgttg 360
ttggtcccat ccaatgacat tgtatacaaa gtttccataa gagaattcat caacttcatc 420
ctcgccggtg ggttagatat aaaactctat ggccacgagg tagcagagct atctcaattg 480
ggcaataccg tgaattttat gtttgctgag ggtacctcat gtaatggtaa aagcgtctta 540
ccgtttagta taaccgggaa aaaacttaaa gaattcatag acccttcaat aaccacaatg 600
aaccccgcaa tggccaaaac taaaaaattt gaattgcaga ccatccaaat caaaactaat 660
aaaactgcca tcaccacatt gcccatctcc aatatggagt atttatctag atttctgaac 720
aagggcatta atgttaaatg caagatcaac gagccacaag tactctcgga taatttagag 780
gaattacgcg ttgcattaaa cggtggcgac aaatataaac tagtctcacg gaagttagat 840
gttgaatcta agaggaattt tgtgaaggaa tatatcagcg atcaacgtaa aaagaggaag 900
tag 903



<210> 229 
<211> 2280 
<212> DNA 

<213> Saccharomyces sp.
<400> 229 

atgcctgcac caaaactcac ggagaaattt gcctcttcca agagcacaca gaaaactacg 60 
aattacagtt ccatcgaggc caaaagcgtc aagacgtcgg ctgatcaggc atacatctac 120 
caagagccta gcgctaccaa gaagatactt tactccatcg ccacatggct gttgtacaac 180
atcttccact gcttctttag agaaatcaga ggccggggca gtttcaaggt accgcaacag 240
ggaccggtga tctttgttgc ggctccgcat gctaaccagt tcgtcgaccc tgtaatcctt 300
atgggcgagg tgaagaaatc tgtcaacaga cgtgtgtcct tcttgattgc ggagagctca 360
ttaaagcaac cccccatagg gtttttggct agtttcttca tggccatagg cgtggtaagg 420
ccgcaggata atttgaaacc ggcagaaggt actatccgcg tagatccaac agactacaag 480
agagttatcg gccacgacac gcatttcttg actgattgta tgccaaaggg tctcatcggg 540
ttacccaaat caatgggatt tggagaaatc cagtccatag aaagtgacac gagtttgacc 600
ctaagaaaag agttcaaaat ggccaaacca gagattaaaa ctgctttact caccggcact 660
acttataaat atgccgctaa agtcgaccaa tcttgcgttt accatagagt ttttgagcat 720
ttggcccata acaactgcat tgggatcttt cctgaaggtg ggtcccacga cagaacaaac 780
ttgttgcccc tgaaagcagg tgtggcgatt atggctcttg gttgcatgga taagcatcct 840
gacgtcaatg ttaagattgt tccctgcggt atgaattatt tccatccaca taagttcagg 900
tcgagagcgg ttgttgaatt cggtgacccc attgaaatac cgaaggaact agtcgccaag 960
taccacaacc cggaaacgaa cagagatgca gtgaaagaat tattagatac catatcgaag 1020
ggtttacaat ccgttaccgt tacatgttct gattatgaaa ctttgatggt ggttcaaacg 1080
ataagaagac tatatatgac acaatttagc accaagttac cgttgccctt gattgtggaa 1140
atgaacagaa gaatggtcaa aggttacgaa ttctatagaa acgatcctaa aatagcggac 1200
ttgaccaaag atataatggc atataatgcc gccttgagac actataatct tcctgatcac 1260
cttgtggagg aggcaaaggt aaatttcgca aaaaacctcg gacttgtttt ttttagatcc 1320
atcgggctct gcatcctctt ttcgttagcc atgccaggta tcattatgtt ctcacctgtc 1380
ttcatattag ccaagagaat ttctcaagaa aaggcccgta ccgctttgtc caagtctaca 1440
gttaaaataa aggctaacga tgtcattgcc acgtggaaaa tcttgattgg gatgggattt 1500
gcgcccttgc tttacatctt ttggtccgtt ttaatcactt attacctcag acataaacca 1560
tggaataaaa tatatgtttt ttccgggtct tacatctcgt gtgttatagt cacgtattcc 1620
gccttaatcg tgggtgatat tggtatggat ggtttcaaat ctttgagacc actggtttta 1680
tctcttacat ctccaaaggg cttgcaaaag ctacaaaagg atcgtagaaa tctggcagaa 1740
agaataatcg aagttgtaaa taactttgga agcgaattat tccccgattt cgatagtgcc 1800
gccctacgtg aagaattcga cgtcatcgat gaagaggaag aagatcgaaa aacctcagaa 1860
ttgaatcgca ggaaaatgct aagaaaacag aaaataaaaa gacaagaaaa agattcgtca 1920
tcacctatca tcagccaacg tgacaaccac gatgcctatg aacaccataa ccaagattcc 1980
gatggcgtct cattggtcaa tagtgacaat tccctctcta acattccatt attctcttct 2040
acttttcatc gtaagtcaga gtcttcctta gcttcgacat ccgttgcacc ttcttcttcc 2100
tccgaatttg aggtagaaaa cgaaatcttg gaggaaaaaa atggattagc aagtaaaatc 2160
gcacaggccg tcttaaacaa gagaattggt gaaaatactg ccagggaaga ggaagaggaa 2220
gaagaagagg aagaagaaga agaggaagaa gaagaagaag ggaaagaagg agatgcgtag 2280



<210> 230 
<211> 2232 
<212> DNA 

<213> Saccharomyces sp.
<400> 230 

atgtctgctc ccgctgccga tcataacgct gccaaaccta ttcctcatgt acctcaagcg 60 
tcccgacggt acaaaaattc atacaatgga ttcgtataca atatacatac atggctgtat 120 
gatgtgtctg tatttctgtt taatattttg ttcactattt tcttcagaga aattaaggta 180 
cgtggtgcat ataacgttcc cgaagttggg gtgccaacca tccttgtgtg tgcccctcat 240 
gcaaatcagt tcatcgaccc ggctttggta atgtcgcaaa cccgtttgct gaagacatca 300 
gcgggaaagt cccgatccag aatgccttgt tttgttactg ctgagtcgag ttttaagaaa 360
agatttatct ctttctttgg tcacgcaatg ggcggtattc ccgtgcctag aattcaggac 420 
aacttgaagc cagtggatga gaatcttgag atttacgctc cggacttgaa gaaccacccg 480 
gaaatcatca agggccgctc caagaaccca cagactacac cagtgaactt tacgaaaagg 540 
ttttctgcca agtccttgct tggattgccc gactacttaa gtaatgctca aatcaaggaa 600 
atcccggatg atgaaacgat aatcttgtcc tctccattca gaacatcgaa atcaaaagtg 660 
gtggagctct tgactaatgg tactaatttt aaatatgcag agaaaatcga caatacggaa 720 
actttccaga gtgtttttga tcacttgcat acgaagggct gtgtaggtat tttccccgag 780 
ggtggttctc atgaccgtcc ttcgttacta cccatcaagg caggtgttgc cattatggct 840 
ctgggcgcag tagccgctga tcctaccatg aaagttgctg ttgtaccctg tggtttgcat 900 
tatttccaca gaaataaatt cagatctaga gctgttttag aatacggcga acctatagtg 960 
gtggatggga aatatggcga aatgtataag gactccccac gtgagaccgt ttccaaacta 1020
ctaaaaaaga tcaccaattc tttgttttct gttaccgaaa atgctccaga ttacgatact 1080
ttgatggtca ttcaggctgc cagaagacta tatcaaccgg taaaagtcag gctacctttg 1140
cctgccattg tagaaatcaa cagaaggtta cttttcggtt attccaagtt taaagatgat 1200
ccaagaatta ttcacttaaa aaaactggta tatgactaca acaggaaatt agattcagtg 1260
ggtttaaaag accatcaggt gatgcaatta aaaactacca aattagaagc attgaggtgc 1320
tttgtaactt tgatcgttcg attgattaaa ttttctgtct ttgctatact atcgttaccg 1380
ggttctattc tcttcactcc aattttcatt atttgtcgcg tatactcaga aaagaaggcc 1440
aaagagggtt taaagaaatc attggttaaa attaagggta ccgatttgtt ggccacatgg 1500
aaacttatcg tggcgttaat attggcacca attttatacg ttacttactc gatcttgttg 1560
attattttgg caagaaaaca acactattgt cgcatctggg ttccttccaa taacgcattc 1620
atacaatttg tctattttta tgcgttattg gttttcacca cgtattcctc tttaaagacc 1680
ggtgaaatcg gtgttgacct tttcaaatct ttaagaccac tttttgtttc tattgtttac 1740
cccggtaaga agatcgaaga aatccaaaca acaagaaaga atttaagtct agagttgact 1800
gctgtttgta acgatttagg acctttggtt ttccctgatt acgataaatt agcgactgag 1860
atattctcta agagagacgg ttatgatgtc tcttctgatg cagagtcttc tataagtcgt 1920
atgagtgtac aatctagaag ccgctcttct tctatacatt ctattggctc gctagcttct 1980
aacgccctat caagagtgaa ttcaagaggc tcgttgaccg atattccaat tttttctgat 2040
gcaaagcaag gtcaatggaa aagtgaaggt gaaactagtg aggatgagga tgaatttgat 2100
gagaaaaatc ctgccatagt acaaaccgca cgaagttctg atctaaataa ggaaaacagt 2160
cgcaacacaa atatatcttc gaagattgct tcgctggtaa gacagaaaag agaacacgaa 2220
aagaaagaat ga 2232

<210> 231 
<211> 1194 
<212> DNA 

<213> Saccharomyces sp. 
<400> 231 

atgctgcatc aaaaaatagc tcataaagtt cgaaaagtcg tcgtcccagg tatttcctta 60 
ttgattttct tccagggatg ccttattctt ttgtttctcc aactcaccta taagactctt 120 
tactgtagaa atgatataag gaaacaaatt ggtctcaata aaaccaaaag attatttatt 180 
gtcttggtat catccatttt gcatgttgtc gcaccatctg cagtgagaat taccactgaa 240 
aattccagtg ttcctaaagg tacttttttt ttagacttga agaagaaaag gattctttct 300 
catctaaagt ccaattcggt ggccatttgc aatcaccaaa tatacacgga ttggatattt 360
ttatggtggt tggcttacac atcgaactta ggggctaatg tcttcattat tttaaaaaaa 420 
tcgttggctt ccattcctat cctcggtttc ggtatgagaa actataattt catttttatg 480 
agtagaaagt gggcacaaga caaaataacc ctaagcaaca gccttgctgg ccttgattcg 540
aatgcaaggg gcgccggctc acttgctgga aagtcacctg agcgcataac tgaggaagga 600
gagagcatat ggaatccgga ggttattgat ccaaaacaaa tccattggcc atacaatctt 660
atcctattcc ctgaaggtac aaatctcagt gctgatacta ggcaaaaaag tgctaaatat 720
gctgccaaaa taggcaaaaa gccattcaag aatgtgctac tgcctcattc tacaggccta 780
agatactcgt tacaaaagtt gaagccaagt attgaaagtc tttatgatat tacgatcggc 840
tactccggtg taaaacagga ggaatatggt gagcttatat atgggctgaa gagcatattt 900
ttagaaggaa aatacccgaa gttagtcgat attcacatca gagcatttga tgttaaagat 960
attccattag aggacgagaa tgaattttca gaatggctgt ataaaatttg gagtgagaag 1020
gatgctctaa tggaaaggta ctattccact ggatcattcg taagtgatcc tgaaacaaac 1080
cattcagtta ccgatagttt caagatcaat cgtattgagt taactgaagt gctaatatta 1140
ccaactctaa caataatttg gttagtttat aaactttatt gttttatttt ttga 1194



<210> 232 
<211> 912 
<212> DNA 

<213> Saccharomyces sp. 



<400> 232 

atgagtgtga taggtaggtt cttgtattac ttgaggtccg tgttggtcgt actggcgctt 60
gcaggctgtg gcttttacgg tgtaatcgcc tctatccttt gcacgttaat cggtaagcaa 120
catttggctc agtggattac tgcgcgttgt ttttaccatg tcatgaaatt gatgcttggc 180
cttgacgtca aggtcgttgg cgaggagaat ttggccaaga agccatatat tatgattgcc 240
aatcaccaat ccaccttgga tatcttcatg ttaggtagga ttttcccccc tggttgcaca 300
gttactgcca agaagtcttt gaaatacgtc ccctttctgg gttggttcat ggctttgagt 360
ggtacatatt tcttagacag atctaaaagg caagaagcca ttgacacctt gaataaaggt 420
ttagaaaatg ttaagaaaaa caagcgtgct ctatgggttt ttcctgaggg taccaggtct 480
tacacgagtg agctgacaat gttgcctttc aagaagggtg ctttccattt ggcacaacag 540
ggtaagatcc ccattgttcc agtggttgtt tccaatacca gtactttagt aagtcctaaa 600
tatggggtct tcaacagagg ctgtatgatt gttagaattt taaaacctat ttcaaccgag 660
aacttaacaa aggacaaaat tggtgaattt gctgaaaaag ttagagatca aatggttgac 720
actttgaagg agattggcta ctctcccgcc atcaacgata caaccctccc accacaagct 780
attgagtatg ccgctcttca acatgacaag aaagtgaaca agaaaatcaa gaatgagcct 840
gtgccttctg tcagcattag caacgatgtc aatacccata acgaaggttc atctgtaaaa 900
aagatgcatt aa 912



<210> 233 
<211> 54 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 233 

cgcgatttaa atggcgcgcc ctgcaggcgg ccgcctgcag ggcgcgccat ttaa 54 

<210> 234 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 234 

tcgaggatcc gcggccgcaa gcttcctgca gg 32



<210> 235 
<211> 32 
<212> DNA 
<213> Artificial Sequence



<220> 
<223> Description of Artificial Sequence : Synthetic
Oligonucleotide 

<400> 235 

tcgacctgca ggaagcttgc ggccgcggat cc 

<210> 236 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 236 

tcgacctgca ggaagcttgc ggccgcggat cc 

<210> 237 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 237 

tcgaggatcc gcggccgcaa gcttcctgca gg 

<210> 238 

<211> 36 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 238 

tcgaggatcc gcggccgcaa gcttcctgca ggagct 

<210> 239 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 239 

cctgcaggaa gcttgcggcc gcggatcc 

<210> 240 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 240 

tcgacctgca ggaagcttgc ggccgcggat ccagct 

<210> 241 

<211> 28 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 241 

ggatccgcgg ccgcaagctt cctgcagg 